Standard Reference Materials:

HANDBOOK FOR SRM USERS

NBS Special Publication 260-100 (1985)

U.S. DEPARTMENT OF COMMERCE
National Bureau of Standards

The National Bureau of Standards¹ was established by an act of Congress on March 3, 1901. The
Bureau's overall goal is to strengthen and advance the nation's science and technology and facilitate
their  effective  application  for  public  benefit.  To  this  end,  the  Bureau  conducts  research  and  provides:  (1)  a 
basis  for  the  nation's  physical  measurement  system,  (2)  scientific  and  technological  services  for  industry  and 
government,  (3)  a  technical  basis  for  equity  in  trade,  and  (4)  technical  services  to  promote  public  safety. 
The  Bureau's  technical  work  is  performed  by  the  National  Measurement  Laboratory,  the  National 
Engineering  Laboratory,  the  Institute  for  Computer  Sciences  and  Technology,  and  the  Institute  for  Materials 
Science and Engineering.


The  National  Measurement  Laboratory 


Provides  the  national  system  of  physical  and  chemical  measurement; 
coordinates  the  system  with  measurement  systems  of  other  nations  and 
furnishes  essential  services  leading  to  accurate  and  uniform  physical  and 
chemical measurement throughout the Nation's scientific community, industry,
and commerce; provides advisory and research services to other
Government  agencies;  conducts  physical  and  chemical  research;  develops, 
produces,  and  distributes  Standard  Reference  Materials;  and  provides 
calibration  services.  The  Laboratory  consists  of  the  following  centers: 


Basic  Standards 
Radiation  Research 
Chemical  Physics 
Analytical  Chemistry 


The  National  Engineering  Laboratory 


Provides  technology  and  technical  services  to  the  public  and  private  sectors  to 
address  national  needs  and  to  solve  national  problems;  conducts  research  in 
engineering and applied science in support of these efforts; builds and maintains
competence in the necessary disciplines required to carry out this
research  and  technical  service;  develops  engineering  data  and  measurement 
capabilities;  provides  engineering  measurement  traceability  services;  develops 
test  methods  and  proposes  engineering  standards  and  code  changes;  develops 
and  proposes  new  engineering  practices;  and  develops  and  improves 
mechanisms  to  transfer  results  of  its  research  to  the  ultimate  user.  The 
Laboratory  consists  of  the  following  centers: 


Applied Mathematics
Electronics and Electrical Engineering²
Manufacturing Engineering
Building Technology
Fire Research
Chemical Engineering²


The  Institute  for  Computer  Sciences  and  Technology 


Conducts  research  and  provides  scientific  and  technical  services  to  aid 
Federal agencies in the selection, acquisition, application, and use of computer
technology to improve effectiveness and economy in Government
operations  in  accordance  with  Public  Law  89-306  (40  U.S.C.  759),  relevant 
Executive  Orders,  and  other  directives;  carries  out  this  mission  by  managing 
the  Federal  Information  Processing  Standards  Program,  developing  Federal 
ADP  standards  guidelines,  and  managing  Federal  participation  in  ADP 
voluntary standardization activities; provides scientific and technological advisory
services and assistance to Federal agencies; and provides the technical
foundation for computer-related policies of the Federal Government. The Institute
consists of the following centers:


Programming Science and Technology
Computer Systems Engineering


The  Institute  for  Materials  Science  and  Engineering 


Conducts  research  and  provides  measurements,  data,  standards,  reference 
materials, quantitative understanding and other technical information fundamental
to the processing, structure, properties and performance of materials;
addresses  the  scientific  basis  for  new  advanced  materials  technologies;  plans 
research  around  cross-country  scientific  themes  such  as  nondestructive 
evaluation  and  phase  diagram  development;  oversees  Bureau-wide  technical 
programs in nuclear reactor radiation research and nondestructive evaluation;
and broadly disseminates generic technical information resulting from
its  programs.  The  Institute  consists  of  the  following  Divisions: 


Inorganic Materials
Fracture and Deformation³
Polymers
Metallurgy
Reactor Radiation


¹Headquarters and Laboratories at Gaithersburg, MD, unless otherwise noted; mailing address
Gaithersburg, MD 20899.
²Some divisions within the center are located at Boulder, CO 80303.
³Located at Boulder, CO, with some elements at Gaithersburg, MD.

PREFACE


Standard Reference Materials (SRM's) as defined by the National Bureau of Standards (NBS)
are well-characterized materials produced in quantity and certified for one or more physical
or chemical properties. They are used to assure the accuracy and compatibility of measurements
throughout the Nation. SRM's are widely used as primary standards in many diverse fields
in science, industry, and technology, both within the United States and throughout the world.
They are also used extensively in the fields of environmental and clinical analysis. In many
applications, traceability of quality control and measurement processes to the national measurement
system is carried out through the mechanism and use of SRM's. For many of the
Nation's scientists and technologists, it is therefore of more than passing interest to know
the details of the measurements made at NBS in arriving at the certified values of the SRM's
produced. An NBS series of papers, of which this publication is a member, called the NBS
Special Publication 260 Series, is reserved for this purpose.

This 260 Series is dedicated to the dissemination of information on different phases of
the preparation, measurement, certification and use of NBS-SRM's. In general, much more
detail will be found in these papers than is generally allowed, or desirable, in scientific
journal articles. This enables the user to assess the validity and accuracy of the measurement
process employed, to judge the statistical analysis, and to learn details of techniques
and methods utilized for work entailing the greatest care and accuracy. These papers also
should provide sufficient additional information not found on the certificate so that new
applications in diverse fields, not foreseen at the time the SRM was originally issued, can be
sought and found.

Inquiries  concerning  the  technical  content  of  this  paper  should  be  directed  to  the  author. 
Other  questions  concerned  with  the  availability,  delivery,  price,  and  so  forth  of  reference 
materials  will  receive  prompt  attention  from: 

Office  of  Standard  Reference  Materials 
National   Bureau  of  Standards 
Gaithersburg,   MD  20899 


Stanley   D.  Rasberry 
Chief 

Office  of  Standard  Reference 
Materials 

ABSTRACT


This handbook was prepared to provide guidance for the use of Standard Reference Materials
(SRM's) to provide an accuracy base for chemical measurements. The general concepts of precision
and accuracy are discussed, together with their realization by quality assurance of the measurement
process. General characteristics of SRM's are described and guidance is given for their selection
for specific applications. Ways to use SRM's effectively are recommended, utilizing
control charts to evaluate and monitor measurement accuracy. Appendices provide statistical
guidance on the evaluation of measurement uncertainty.


Key words: accuracy; calibration; chemical analysis; control charts; measurement uncertainty;
precision; quality assurance; standard reference materials; statistical control.

1. Introduction


Standard Reference Materials (SRM's) have become well established as benchmarks for the
quality assurance of measurements. Some analytical chemists find them indispensable for this
purpose and use them systematically; others use them sporadically. Yet others, especially
in some areas of analytical chemistry, rarely if ever use reference materials.
From casual observation, the frequency of use of reference materials is closely
coupled with one or more of the following factors:

- familiarity with the philosophy of their use

- degree of appreciation of the benefits of their use

- availability of directly applicable SRM's

- understanding of the role of directly and indirectly related SRM's in a measurement system

- degree of full appreciation of measurement as a system.

The National Bureau of Standards has pioneered and continues to be the leader in the
development of Standard Reference Materials (SRM's) for quality assurance of measurements.
The program to provide such materials, originally known as standard samples, was initiated in
1906, largely in response to needs of the metals industry. It has since grown to a multi-material
program of over 1000 items that serves most of the areas of modern analytical chemistry.
A large number of materials useful in physical metrology and engineering are also
included. Some areas of analysis are covered more completely than others, due to historical
reasons, priorities for national issues, and to some extent the degree of industrial awareness
of the quality assurance concept. For the early users, SRM's were identical, in most
respects, to the materials ordinarily analyzed. Thus the results of measurements of SRM's
were easily interpreted. As the program has grown, it has become impossible to provide SRM's
with a one-to-one correspondence to every conceivable application; as a result, generic standards
are commonly produced which serve multiple purposes. This concept broadens the scope of
applications and conserves effort and cost of production.

This handbook was prepared with the objective of improving the understanding and the basis
for use of SRM's. While written from the viewpoint of a chemist, the basic concepts described
are believed to be applicable to most areas of metrology. The handbook is arranged in a logical
progression, starting with the basic concepts of precision and accuracy, followed by discussions
of the calibration and quality assurance of the measurement process, the use of
Standard Reference Materials to evaluate various kinds of measurements, and the reporting of
data with evaluated limits of uncertainty. The statistical considerations most frequently
applicable for the evaluation and interpretation of measurement data are reviewed in the
Appendix. Each section is written with some degree of independence so that it can be
understood without undue reference to the contents of other sections.

The treatment of each subject is not claimed to be exhaustive but is often an overview.
Accordingly, a list of selected references is included which contains both background and other
information supplemental to that in the text. A listing of recent research papers from the
NBS Center for Analytical Chemistry, related to the preparation, analysis, and certification
of specific SRM's, is also included.


SOURCES   OF  INFORMATION 
(See   page   40   for  more  details) 


NBS  Standard  Reference  Materials 
Telephone Number: 301-921-2045

Mailing  lists  are  maintained  to  keep  SRM  users  updated  with  catalogs  and 
information  on  new  SRM's.     Call   the  number   above   to  have   your   name  added. 


The free magazine American Laboratory has a monthly column called
"Reference Materials" to provide an information forum for SRM users.


2.     Precision  and  Accuracy 


2.1     Concepts  of  Precision  and  Accuracy 

Accuracy  is  an  intuitively  understandable  and  desirable  requirement  for  most 
measurements.  Data  which  are  knowingly  inaccurate  or  whose  accuracy  is  unknown  have  little 
appeal  to  most  users.  Yet  precision  is  sometimes  confused  with  accuracy  and  the  agreement  of 
successive  results  can   inspire  a  degree  of  confidence  that   the  measurements  may  not  merit. 

Accuracy, the closeness of a measured value to the true value, includes the concepts of
bias and precision (see figure 1 and figure 2) and is judged with respect to the use to be
made of the data. A measurement process must be unbiased to be capable of producing accurate
values. Even then, it must be sufficiently precise as well, or else the individual
results will be inaccurate due to unacceptable variability. The following discussion is presented
to clarify these concepts. The term uncertainty is used widely in describing the
results of measurement and denotes an estimate of the bounds of inaccuracy. Strictly speaking,
the actual error of a reported value is usually unknowable. However, limits of error
ordinarily can be inferred, with some risk of being incorrect, from the precision and
reasonable limits for the possible bias of the measurement process.


Figure   1.     Unbiased  Measurement  Processes 

The  distributions  of  results  from  three   unbiased   processes  are  shown. 
The   precision   decreases    in   the   order   A>B>C.     While   the  limiting 
means  of  all  will   approach  the  "true   value,"   process  C   is  relatively 
inaccurate   (compared  with  A)   due  to   its  imprecision. 


Figure  2.     Biased  Measurement  Processes 

All   of   the   processes   are   biased   and   hence   inaccurate   since  the 
limiting  means   do   not   coincide  with   the   "true   value."     However,  it 
will   be  noted  that  most   of   the  results  for   process  A'    will   be  more 
accurate   than   those  of   process   C   and   even   B   (figure   1),    due  to 
precision  considerations. 

The concept of precision is concerned with the variability of the individual results of
replicate  measurements.  A  process  which  shows  a  small  scatter  is  said  to  be  precise  and  vice 
versa.  Obviously  such  judgments  are  subjective  and  based  on  the  intended  use  of  the  data. 
What  might  be  considered  as  very  precise  for  one  purpose  could  be  grossly  imprecise  for 
another.  Random  errors  are  responsible  for  the  observed  scatter  of  measured  values.  These  may 
be  reduced  to  the  point  at  which  they  are  negligible  with  respect  to  the  tolerable  error  of 
the  measured  value,  or  are  limited  by  inherent  characteristics  of  the  instrumentation  or  the 
methodology used. The averages of several series of measurements will show a smaller variability
than the individual values, and the grand average of such is expected to approach a
limiting value (limiting mean) as the number of measurements is increased.

The  concept  of  bias  is  concerned  with  whether  or  not  the  limiting  mean  differs  from  the 
true  (or  accepted)  value  of  the  property  measured.  Here  again,  judgment  is  ordinarily 
involved  since  it  is  impossible  to  eliminate  all  error  or  even  to  know  if  this  has  been 
achieved. Such decisions are thus based on whether or not bias exists for all practical
purposes.

In  the  case  of  individual  measurements,  each  will  exhibit  some  degree  of  inaccuracy,  that 
is  to  say  it  will  deviate  from  the  true  value.  This  will  occur  because  of  random  error 
together  with  any  bias  of  the  measurement  system.  Indeed,  it  is  highly  improbable  that  any 
individual  measurement  made  by  an  unbiased  measurement  system  will  be  accurate,  since  the 
probability of zero random error is zero. Many individual values may appear to have the
correct value but this is due to truncation resulting from insensitivity of the measurement
process or from rounding of the data.

A  measurement  process  should  be  sufficiently  precise  to  minimize  the  number  of  replicate 
measurements required for the intended use. A very precise system may need only a few measurements,
even one, to provide data that would not be significantly improved by further
replication.  Also,  a  measurement  system  must  be  sufficiently  precise  to  identify  whether  or 
not  biases  of  a  comparable  magnitude  are  present  in  the  system.  While  possible  in  principle, 
an  unbiased  measurement  process  of  low  precision  may  be  incapable  of  providing  accurate  data, 
from  a  practical  point  of  view,  because  of  the  large  number  of  measurements  required  to  reduce 
the  uncertainty  of  the  random  error  to  reasonable  limits. 
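
The trade-off between precision and the number of replicates can be made concrete with a short calculation. The sketch below assumes that the standard deviation of a single measurement, s, is already known and that the goal is a standard deviation of the mean no larger than a chosen target; the function name and the numerical values are hypothetical and serve only as illustration.

```python
import math


def replicates_needed(s, target_sd_of_mean):
    """Smallest n for which s / sqrt(n) does not exceed the target.

    s                 -- standard deviation of a single measurement
    target_sd_of_mean -- largest acceptable standard deviation of the mean
    """
    if s < 0 or target_sd_of_mean <= 0:
        raise ValueError("s must be >= 0 and the target must be > 0")
    # s / sqrt(n) <= target  is equivalent to  n >= (s / target)**2
    return max(1, math.ceil((s / target_sd_of_mean) ** 2))


# Hypothetical processes, same target for the mean (1.0 unit)
n_precise = replicates_needed(0.5, 1.0)    # a single measurement suffices
n_imprecise = replicates_needed(5.0, 1.0)  # 25 replicates are required
```

A tenfold loss of precision raises the required number of replicates a hundredfold, which is why an unbiased but imprecise process can be impractical.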

Precision  may  be  evaluated  by  the  redundant  process  of  replicate  measurement.  Results  on 
a  single  material  may  be  used  for  this  purpose,  or  the  information  obtained  on  a  number  of 
samples  (even  duplicate  measurements,  see  Appendix  C.2.2)  may  be  pooled.  Accordingly,  there 
is  no  reason  why  a  laboratory  cannot  evaluate  its  own  precision  without  external  assistance 
[1].     While  SRM's  may   be   helpful   in   this   regard,    they  are   not   necessary   for   this  purpose. 
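
As an illustration of pooling, the following sketch estimates a standard deviation from duplicate measurements on several samples, using the customary formula s = sqrt(sum(d^2) / 2k), where d is the difference within each pair and k is the number of pairs. Appendix C.2.2 is not reproduced in this section, so the notation here is an assumption, and the data are invented for the example.

```python
import math


def pooled_sd_from_duplicates(pairs):
    """Pooled standard deviation from k pairs of duplicate results.

    Each pair holds two results on the same sample. The samples may
    differ in level, provided the precision is about the same for all.
    """
    if not pairs:
        raise ValueError("at least one pair of duplicates is required")
    sum_sq_diff = sum((a - b) ** 2 for a, b in pairs)
    return math.sqrt(sum_sq_diff / (2 * len(pairs)))


# Hypothetical duplicate results on four different samples
duplicates = [(10.0, 10.2), (5.1, 4.9), (20.3, 20.3), (7.0, 7.4)]
s_pooled = pooled_sd_from_duplicates(duplicates)  # about 0.17
```

The estimate carries k degrees of freedom, one from each pair.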

In order to properly estimate precision, a number of measurements over an extended period
of time may be required. A small number of measurements tends to underestimate the standard
deviation since small random errors are more probable than large ones, and large ones are less
likely to be observed during a limited set of measurements. Also, it is common experience that it is much
easier to repeat a measurement than to reproduce it over a period of time. The repeatability,
or short-term standard deviation, is needed to answer questions about the number of repetitive
measurements that may be required, while the long-term standard deviation, or reproducibility,
is needed to answer such questions as the agreement of data obtained at different times, or
the statistical control of a measurement process.

Though precise measurements can serve useful purposes when limited comparisons are
required, accuracy is more often an essential requirement. Whenever the true value of the
measured quantity is needed, or when data from different laboratories, different methodologies,
or from the same laboratory using the same method over a period of time need to
be interrelated, bias can be a serious problem. The analysis of appropriate reference materials
is the best and easiest way to investigate bias. While methods may be compared with
reference methods to assess accuracy, this is ordinarily a more difficult and time-consuming
process (see Appendix D.4).

2.1.1     Precision  and  Bias   in  a  Measurement  System 

The precision of a measurement system may be influenced by a number of factors, each
having its own precision. The precision of each factor, quantified in terms of the variance,
contributes to the precision of the process. The variance is simply the square of the standard
deviation, $s$. In measurement processes, the variances of the individual steps, $s_i^2$, add
up to define the variance of the process, i.e., $s^2 = s_1^2 + s_2^2 + s_3^2 + \cdots + s_n^2$. Some of the
steps (or factors) can be easily identified and the individual variances estimated. Examples
are weighing and extraction. As steps are identifiable, improvements conceivably can be made
when there are "assignable causes" for undesirable imprecision. Because of addition in quadrature,
it is evident that one or a few sources of variance can be the major contributors to
the total variance. Knowledge of the magnitude of the individual variances can indicate both
directions for improvement and possible sources of trouble when "out-of-control" measurements
occur.
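
The dominance of one or a few steps under addition in quadrature can be seen in a small numerical sketch. The step names and standard deviations below are hypothetical, chosen only to show the effect; they are not taken from any actual measurement process.

```python
import math


def combined_sd(step_sds):
    """Standard deviation of the whole process from independent steps.

    The variances (squares of the step standard deviations) add, and the
    square root of the sum is the process standard deviation.
    """
    return math.sqrt(sum(s ** 2 for s in step_sds))


# Hypothetical standard deviations for three steps, in the same units
steps = {"weighing": 0.1, "extraction": 1.0, "instrument reading": 0.3}

s_total = combined_sd(steps.values())  # about 1.05

# Fraction of the total variance contributed by each step
variance_share = {name: s ** 2 / s_total ** 2 for name, s in steps.items()}
```

In this example extraction accounts for about 91 percent of the total variance, so improving the weighing step would leave the overall precision essentially unchanged.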

It is conceivable that variance can be reduced to very low levels, with diligent effort.
Laboratories  commonly  improve  their  precision  as  they  gain  experience  with  their  methodology. 
Ordinarily,  a  laboratory  will  improve  its  quality  control  practices  to  the  point  where  the 
precision  attained  is  adequate  for  a  particular  application  or  when  peer  performance  has  been 
attained.  Because  analysis  must  be  pragmatic,  cost-benefit  decisions  will  often  dictate  how 
far  to  go.  For  example,  it  is  a  matter  of  record  that  laboratories  using  the  same  methodology 
will  differ  in  their  precisions.  This  may  be  due  to  difference  in  levels  of  skill  but  also  to 
different   levels   of   tolerance   for   permissible  error. 

Bias in measurement systems can result from several sources. The commonly recognized
ones are: control of measurement variables; interferences; erroneous calibration; contamination;
losses; deteriorations; inefficiencies in extractions or sample dissolution. Variability
in some of these can contribute to random error as well, and often to a major degree.
Inappropriate calibration techniques can be a serious source of bias. Reliance on spiking,
which may not simulate matrix-incorporated analyte, or the use of a pure matrix (e.g., pure
water) to simulate a natural matrix (waste water) are typical examples. A reference material
that closely simulates the analytical samples is needed to identify and evaluate such biases.

Unlike random errors, systematic errors or biases from several sources are not
necessarily randomly distributed; hence one must consider that biases can add up
algebraically. That is to say, the total bias $B = B_1 + B_2 + \cdots + B_n$. Thus, a large number
of small biases can equal or even exceed a large bias from a single source. While the effect
of random error decreases as the number of measurements, $n$, is increased ($s_{\bar{x}} = s_x/\sqrt{n}$), the
effect of bias is independent of the number of measurements.
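
The contrast between the two kinds of error can be sketched numerically. In the example below the individual biases and the standard deviation are hypothetical values; the two functions simply restate the relations given in the paragraph above.

```python
import math


def total_bias(biases):
    """Algebraic (signed) sum of the biases from individual sources."""
    return sum(biases)


def sd_of_mean(s_x, n):
    """Standard deviation of the mean of n measurements."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return s_x / math.sqrt(n)


# Six small biases of the same sign exceed a single bias of 1.0 unit
many_small = total_bias([0.2] * 6)   # about 1.2

# Biases of opposite sign may cancel, which can hide their presence
cancelling = total_bias([0.5, -0.5])  # 0.0

# Replication reduces the random component but leaves the bias untouched
random_part = {n: sd_of_mean(2.0, n) for n in (1, 4, 100)}
```

With s_x = 2.0, the standard deviation of the mean falls from 2.0 to 1.0 to 0.2 as n goes from 1 to 4 to 100, while a total bias of about 1.2 remains in the mean no matter how many measurements are averaged.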

2.2     Dependence  on  Standards 

All  measurements  depend  on  standards.  Physical  measurements  depend  almost  entirely  on 
physical  standards  with  little  or  no  dependence  on  chemical  standards.  Chemical  measurements 
on  the  other  hand  depend  on  both  with  greater  dependence  on  the  latter.  The  early  recognition 
of  the  need  for  universally  acceptable  physical  standards,  and  the  chaos  that  could  result 
from  their  unavailability  led  to  the  development  of  the  now  universally  accepted  physical 
standards  for  the  primary  units  of  length,  mass,  time,  temperature,  and  radiant  luminosity  and 
the  units  derived  from  them  such  as  pressure,  force,  acceleration,  power,  and  density.  No 
corresponding chemical standards have ever been developed. There are, of course, the useful
standards of atomic weights and a variety of physicochemical standards.

The early chemical analytical measurements were largely absolute in nature, which means
that they depended almost entirely on physical standards. Thus, classical analysts used gravimetry,
in which chemical constituents were removed quantitatively or isolated from their
matrix, purified, and weighed. Relation of such masses to the chemical information desired was
calculated by stoichiometry. The critical sources of error in such measurements were incomplete
separations, mechanical losses, and contamination due to coprecipitation and analytical
blanks. Physical standards were the primary standards and provided adequate and sufficient
means to control the accuracy of such chemical measurements.

While classical methods, augmented by such physical-chemical techniques as coulometry,
still provide the basis for the most accurate measurement of major constituents or for the
assay  of  pure  materials,  the  bulk  of  modern  chemical  measurements  are  made  by  comparative 
techniques  in  which,  in  essence,  an  instrument  is  used  to  compare  an  unknown  sample  with  one 
of  known  composition.  Some  measurements  require  the  removal  of  the  substance  of  interest 
prior  to  analysis,  or  its  isolation  from  the  matrix  using  physical  or  chemical  techniques.  In 
others,  the  analytical  process  may  combine  the  separation  and  measurement  steps.  Separation 
of   a   group   of   analytes,    followed   by   selective   detection   is   another   approach   to  analysis. 

The trend toward comparative measurements has shifted the need in chemical analysis from
heavy dependence on physical standards to heavy dependence on chemical standards. However,
there is usually no problem in obtaining chemicals of requisite purity to serve as chemical
standards so that essentially no national or international standard chemicals have been developed
or exist today, in the same sense as the physical standards. When reference materials
exist, they are ordinarily not chemical standards in the hierarchical sense of physical standards
of measurement but rather are quality assurance materials as will be discussed later.
Of course, some reference materials are high purity chemicals which may be used as primary
standards in some areas of chemical analysis.

2.3     Physical   and   Chemical  Standards 

Seven basic units for physical measurements have been adopted by international agreement.
From these, all other units of measurement may be derived [2]. The basic units are defined by
appropriate  artifacts  or  measurements.  Transfer  standards  may  be  calibrated  with  respect  to 
the  basic  standards  maintained  in  national  laboratories.  Such  calibrations  must  be  done  with 
a  sufficient  degree  of  reliability,  traceable  to  national  standards.  For  most  chemical 
measurements,  uncertainties  in  the  physical  standards  used  do  not  contribute  significantly  to 
the   analytical  uncertainty. 

Chemical standards differ from physical standards in several ways. They are chemical
elements or compounds, usually identical with, or related by stoichiometry to, the analytes
measured. It is ordinarily possible to obtain such standards in sufficient purity or to
purify them adequately so there is no need to maintain them in a national laboratory. Due to
the complexity and variety of chemical measurements it would be infeasible if not impossible
to do so.

The problem in the use of chemical standards is the degree to which they can be blended
or incorporated into a sample matrix to produce a substance that can reliably calibrate or
define the response function of a chemical analyzer. Matrix match between standard and analytical
sample is often critical but difficult to achieve. When standards are carried through
an entire analytical process, spikes, surrogates, and other artificially introduced constituents
may not respond in the same manner as naturally occurring analytes, thus causing calibration
problems. On the other hand, standards prepared to simulate the final analytical
sample (e.g., an extract or a solution of the original sample) may not calibrate the entire
analytical process.

No matter what kind of standards are used, they must be prepared with care from reliable
starting materials. The mode of preparation should be such that the uncertainty of the standards
does not contribute significantly to the overall analytical uncertainty. Chemists
ordinarily assume that standards can be prepared with negligible error. Standards for very
low concentration levels may be exceptions. Furthermore, the stability of such standards and
the degree of protection required to safeguard them from contamination, deterioration, or
losses always need consideration.

Standards should never be used in an extrapolative mode. They should always bracket the
measurement range. No measurement should be reported at a value lower or higher than the
lowest or highest standard used to calibrate the measurement process.
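
The bracketing rule lends itself to a simple check before a result is reported. The sketch below is one possible way to express it; the function name and the standard values are hypothetical.

```python
def is_bracketed(value, standards):
    """True if value lies within the range spanned by the standards.

    standards -- values of the standards used for calibration, in the
                 same units as value; at least two are needed to bracket.
    """
    if len(standards) < 2:
        raise ValueError("at least two standards are needed to bracket a range")
    return min(standards) <= value <= max(standards)


# Hypothetical calibration standards, e.g. in mg/L
calibration_standards = [0.5, 1.0, 2.0, 5.0, 10.0]

reportable = is_bracketed(3.7, calibration_standards)     # True
extrapolated = is_bracketed(12.0, calibration_standards)  # False
```

A result that fails the check calls for dilution of the sample or for additional standards, not for extrapolation.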

2.4 Calibration, Standardization, and Analytical Response Function

Calibration may be defined as the comparison of a measurement standard or instrument with
another standard or instrument to identify or eliminate by adjustment any variation (deviation)
of the accuracy of the item being compared. Physical standards such as masses, and
instruments such as thermometers, are calibrated. Physical standards or calibrated instruments,
traceable to national standards, are required for calibration of other standards or
instruments. The uncertainty of the calibrations will depend on the uncertainty of the values
of the standards used and the measurement processes used for the intercomparisons.

Chemical  measurements  require  standards  consisting  of  pure  chemicals  or  liquid,  solid,  or 
gaseous  mixtures  prepared  from  them.  For  most  applications,  chemicals  of  sufficient  purity 
for  use  as  standards  or  for  their  preparation,  can  be  obtained  from  suppliers.  For  critical 
applications,  chemical  standards  are  sometimes  assayed  to  determine  their  purity  or  analyzed 
for  impurities  in  order  to  calculate  their  composition.  The  latter  practice  can  be  erroneous 
unless  it  is  ascertained  that  all  significant  impurities  have  been  determined.  The  concentra- 
tions of  solutions  used  as  analytical  reagents  or  as  calibrants  are  sometimes  defined  on  the 
basis of their preparational data and knowledge or assumptions of purity. When concentrations
are  determined  by  comparison  with  other  solutions  of  known  concentrations,  the  process  is 
called  standardization. 

In  general,  the  calibration  of  a  chemical  analyzer  consists  in  the  evaluation  of  its 
response  function,  in  terms  of  chemical  composition  of  the  samples  to  be  analyzed.  The  anal- 
yzer responds  to  some  property  of  the  analyte,  the  value  of  which  needs  to  be  quantified  by 
use  of  known  substances.  Then  it  is  tacitly  assumed  that  the  instrument  will  respond  analog- 
ously to  the  standard  and  test  samples.  The  sources  of  uncertainty  in  this  case  are  the 
uncertainty   in   composition   of   the   known  samples   and   the   validity   of   the  analogy. 

It is a generally accepted principle of reliable analysis that chemical analyzers should
be  calibrated  over  the  full  range  of  measurement  and  that  measurement  data  be  restricted  to 
the  range  calibrated.  It  is  not  good  measurement  practice  to  report  extrapolated  data,  i.e., 
data  outside  of  the  range  calibrated.  The  range  of  reliable  calibration  can  be  considered  as 
the  range   of   reliable  measurement   and  conversely. 

The  necessary  frequency  of  re-calibration  or  re-evaluation  of  a  response  function  will 
depend  on  the  stability  of  the  measurement  system  and  the  accuracy  requirements  for  the  data. 
To  ensure  confidence  in  measured  values,  such  re-evaluations  should  be  made  before  significant 
changes   are   to   be  expected. 

The  terms  primary  and  secondary  standards  are  used  frequently  and  need  some  discussion. 
Strictly  speaking,  a  primary  standard  is  one  whose  value  may  be  accepted  without  further  veri- 
fication by  the  user.  It,  in  turn,  may  be  used  to  establish  or  ascertain  a  value  for  a  secon- 
dary standard.  Thus,  a  secondary  standard  provided  by  one  laboratory  (for  example,  a  mass 
standard  calibrated  at  NBS)  could  serve  as  a  primary  standard  for  another  laboratory.  In  any 
case,  the  uncertainty  of  the  value  of  any  standard  must  be  known  since  the  adjective 
designation    (primary  or   secondary)    does   not   define   any   limits   of   uncertainty   for   its  value. 

Analytical chemists have used the terms primary and secondary to indicate relatively pure
materials that may be used to prepare solutions with accurately known compositions (3). The
International Union of Pure and Applied Chemistry (4) has set minimum levels of purity for
primary and secondary chemicals.

Specifications for certain classes of physical standards have been established which
include design characteristics and permissible departures (tolerances) from nominal values.
Thus a 1 gram weight of Class 1 will have a tolerance of 0.034 mg while the tolerance is 2 mg
for  a  Class  6  weight  of  the  same  denomination  (5).  The  nominal  weight  is  assumed,  when  used, 
but  it  should  be  recognized  that  the  true  value  may  lie  anywhere  within  the  tolerance  range. 
If  such  an  uncertainty  is  too  large,  the  standard  may  be  calibrated,  but  upgrading  may  be 
difficult   due   to   design  considerations. 

2.5     National  Measurement  System  for  Analytical  Chemistry 

What  might  be  called  the  National  Measurement  System  for  Analytical  Chemistry  is  shown  in 
Figure  3.  The  measurement  of  any  specific  sample  requires  a  measurement  system,  individually 
designed  with  consideration  of  the  requirements  for  the  data.  This  system  must  be  calibrated, 
using  physical  and  chemical  standards.  As  already  discussed,  the  physical  standards  may  be 
traceable  to  national  primary  standards  maintained  by  the  National  Bureau  of  Standards  and 
compatible  with  those  of  other  nations.  The  chemical  standards,  generally,  will  be  prepared 
by the measurement laboratory. Ordinarily, they serve as quality assurance materials to
evaluate measurement accuracy, to intercalibrate laboratories in a measurement program, and to
provide compatibility of measurement data. SRM's can serve as calibrants in some cases.

Figure 3. National Measurement System for Analytical Chemistry

The  figure  illustrates  a  number  of  points  that  have  been  discussed  earlier.  The  critical 
dependence  of  modern  analysis  on  both  physical  and  chemical  standards  is  indicated  although 
the  former  of  requisite  reliability  usually  are  available  to  the  analyst  from  external 
sources.  Ordinarily,  the  chemist  must  prepare  all  chemical  standards  used,  starting  with 
source  materials  as  indicated.  Questions  about  the  matrix  match  of  standards  and  test  samples 
always  must  be  considered.  The  measurement  process  is  highly  dependent  on  broad  areas  of 
science   and   technology,    as  well. 

Quality assurance of the measurement process is essential for reliable data. The
important  role  of  SRM's  in  controlling  the  calibration  process  and  in  assessing  data  quality 
is  shown  and  will  be  elaborated  on  throughout   this  handbook. 

3.     Quality Assurance


Quality  assurance  is  the  name  given  to  the  procedures  used  to  ascertain  that  measurement 
data  are  good  enough  for  their  intended  purpose  (1,6).  It  involves  two  distinct  but  related 
activities: 

quality  control  -  those  procedures  and  activities  developed  and  implemented  to  produce  a 
measurement   of   requisite  quality 

quality  assessment  -  those  procedures  and  activities  utilized  to  verify  that  the  quality 
control  system  is  operating  within  acceptable  limits  and  to  evaluate  the  quality  of  the 
data.

The  basic  requirements  for  producing  reliable  data  are  appropriate  methodology, 
adequately  calibrated,  and  properly  used.  These,  together  with  good  laboratory  and  measure- 
ment practices,  are  the  basic  ingredients  of  a  quality  control  program.  The  quality  of  the 
data  may  be  assessed  by  use  of  reference  materials  to  evaluate  bias  and  the  time-consuming 
process   of   redundancy   to   evaluate  precision. 

3.1     Quality  Assurance  of  a  Measurement  Process 

There  is  a  growing  awareness  that  analytical  data  for  use  in  any  decision  process  must  be 
technically  sound  and  defensible.  Limits  of  uncertainty  are  required  which  need  to  be  sup- 
ported by  suitable  documentary  evidence.  Professional  analytical  chemists  have  always 
espoused  this  philosophy.  Regulatory  agencies  and  contracting  parties  increasingly  are  spe- 
cifying it  as  a  routine  requirement.  The  formal  and  even  informal  procedures  used  to  estab- 
lish the  limits  of  uncertainty  of  measurement  data  are  generally  referred  to  as  quality 
assurance,  in  which  replicate  measurements  and  independent  procedures  support  claims  for  the 
accuracy  of  the  data.  When  a  measurement  process  can  be  established  and  demonstrated  to  be  in 
a  state  of  statistical  control,  the  accuracy  of  the  process  can  be  imputed  to  characterize 
the  accuracy  of  all  data  produced  by  it.  Hence  the  requirements  for  redundancy  can  be  greatly 
reduced.

A  measurement  process  of  the  type  described  above  is  illustrated  in  Figure  4.  It 
utilizes  methodology  appropriate  for  the  measurement  program  and  appropriate  quality  control 
practices  are  followed.  Statistical  control  is  demonstrated  by  the  measurement  of  replicate 
samples  and  internal  reference  materials,  using  control  charts.  This  also  permits  the 
evaluation   of   the   precision   of   the  process. 


Figure 4. Measurement Process Quality Assurance


When  the  process  is  demonstrated  to  be  in  a  state  of  statistical  control,  reference 
materials  such  as  SRM's  may  be  analyzed  to  assess  measurement  accuracy.  The  resulting  judg- 
ment of  precision  and  accuracy  can  be  assigned  to  the  sample  data  output  of  the  process.  The 
figure  shows  also  how  data  quality  assessment  is  used  in  a  feedback  mode  to  monitor  the  pro- 
cess, to  initiate  corrective  actions  as  required,  and  in  a  decision  mode  for  the  release  or 
use of data.

3.2     Statistical  Control 

A  stable  measurement  system  is  expected  to  produce  reproducible  data.  Statistical  control 
may be defined as the attainment of a state of predictability. Under such a condition, the
mean of a large number of measurements will approach a limiting value (limiting mean) and the
individual measurements should have a stable distribution, described by their standard deviation.
Under such a condition, the limits within which any new measured value would be expected
to lie can be predicted with a specified probability, the confidence limits for a measurement
or the mean of a set of measurements can be calculated, and the number of measurements required to
obtain a mean value with a given confidence may be estimated.
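
The last two calculations mentioned above can be sketched as follows. This is a rough illustration under stated assumptions: the measurements are taken to be normally distributed, and the normal (z) multiplier is used for simplicity, whereas Student's t is the more appropriate multiplier when the standard deviation is estimated from only a few measurements. The function names and data are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean, stdev


def confidence_limits(values, confidence=0.95):
    """Approximate two-sided confidence limits for the mean of `values`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 percent
    half_width = z * stdev(values) / sqrt(len(values))
    center = mean(values)
    return center - half_width, center + half_width


def measurements_needed(s, half_width, confidence=0.95):
    """Number of measurements for which the mean has the requested half-width."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return ceil((z * s / half_width) ** 2)


# Hypothetical replicate results on one sample.
replicates = [10.1, 9.9, 10.0, 10.2, 9.8]
lower, upper = confidence_limits(replicates)
```

With a standard deviation of 0.10 and a desired half-width of 0.05 at 95 percent confidence, `measurements_needed(0.10, 0.05)` gives 16 under these assumptions.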

It  is  axiomatic  that  attainment  of  statistical  control  is  the  first  objective  of  a 
measurement  process.  This  is  just  another  way  of  stating  that  it  must  achieve  stability. 
Yet,  it  has  the  further  connotation  that  the  data  produced  are  statistically  describable. 
Eisenhart has stated: "Until a measurement operation has been 'debugged' to the extent that
it  has  attained  a  state  of  statistical  control  it  cannot  be  regarded  in  any  logical  sense  as 
measuring  anything  at   all  (7)." 

When  a  measurement  system  is  altered  or  disturbed,  a  new  or  modified  measurement  system 
may  result  with  a  limiting  mean  and/or  a  standard  deviation  different  from  the  previous 
values.  During  normal  use  of  a  measurement  system,  changes  can  occur  as  well,  unbeknown  to 
the  laboratory  personnel.  A  well  designed  quality  assurance  program  will  monitor  the  system 
for   such   changes   and   indicate  when   corrective   actions   are  required. 


3.3     Control  Charts 


The  philosophy  of  the  use  of  control  charts  is  based  on  the  premise  that  analytical 
measurements  may  be  systematized  to  provide  a  process  simulating  a  manufacturing  process  in 
many  respects.  As  the  result  of  quality  control  procedures,  a  system  may  be  debugged  and 
attain  a  state  of  statistical  control  of  its  data  output.  The  accuracy  of  the  system  can  be 
evaluated  for  typical  test  samples  and  thus  can  be  assigned  to  all  similar  measurement  data 
generated   by   the  system. 

A  control  chart  is  simply  a  graphical  way  to  interpret  test  data.  In  its  simplest  form, 
a  selected  reference  sample  is  measured  periodically  and  the  results  are  plotted  sequentially 
(or  time-ordered)  on  a  graph.  Limits  for  acceptable  values  are  defined  and  the  measurement 
system  is  assumed  to  be  in  control  (variability  is  stable  and  due  to  chance  alone)  as  long  as 
the  results  stay  within  these  limits.  A  second  useful  form  of  control  chart  is  one  in  which 
the  standard  deviation  or  range  (even  differences  between  duplicates)  of  a  series  of  measure- 
ments is  plotted  in  a  similar  manner.  The  residence  of  the  values  within  expected  limits  is 
accepted  as  evidence  that  the  precision  of  measurement  remains  in  control.  The  monitored 
precision  of  measurement  and  the  accuracy  of  measurement  of  the  reference  sample  may  be 
transferred,  by  inference,  to  all  other  appropriate  measurements  made  by  the  system  while  it 
is   in   a  state  of  control. 


Examples of each kind of control chart described above are given in Figure 5. In Figure
5A, the mean, x̄, of 2 measurements is plotted sequentially. The central line is the most
probable value for x̄ (i.e., the grand average, x̿, of measurements of x̄) and the limits LWL to UWL
(lower and upper warning limits) define the area in which 95 percent of the plotted points are
expected to lie. The limits LCL to UCL (lower and upper control limits) define the area in
which almost all (99.7 percent) of the plotted points are expected to lie when the system is in a
state of statistical control. It should be clear that when more than 5 percent of the points
lie outside of the warning limits or when values fall outside of the control limits, the
system is behaving unexpectedly and corrective actions, and even rejection of data, may be
required.
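
The two criteria just stated (no points beyond the control limits, and no more than 5 percent of points beyond the warning limits) can be checked mechanically. The following is a minimal sketch; the function name, the limits, and the data are hypothetical.

```python
def assess_chart(points, lwl, uwl, lcl, ucl):
    """Summarize plotted points against warning and control limits."""
    beyond_control = [i for i, p in enumerate(points) if not lcl <= p <= ucl]
    beyond_warning = [i for i, p in enumerate(points) if not lwl <= p <= uwl]
    warning_fraction = len(beyond_warning) / len(points)
    return {
        "beyond_control": beyond_control,      # positions of offending points
        "warning_fraction": warning_fraction,  # includes points beyond control limits
        "in_control": not beyond_control and warning_fraction <= 0.05,
    }


# Hypothetical data: 20 plotted means with limits centered on 10.0.
points = [10.0, 10.1, 9.9, 10.05, 9.95] * 4
points[7] = 10.45  # one point beyond the upper control limit
report = assess_chart(points, lwl=9.8, uwl=10.2, lcl=9.7, ucl=10.3)
```

Here the single point beyond the control limit is enough to flag the system, even though the fraction beyond the warning limits does not exceed 5 percent.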

A discussion of the strategy to follow in the use of control charts is beyond the scope
of the present discussion, but laboratories using them need to develop one. Results are
expected to scatter randomly within the limits. Systematic trends or patterns in the data
plots may be early warning of incipient problems and are cause for concern; hence techniques
to identify them should be practiced.
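
One simple technique of this kind is to look for long runs of consecutive points on the same side of the central line. The sketch below is illustrative only: the handbook does not prescribe a particular rule, and the run length that should trigger concern is an assumption each laboratory must set for itself.

```python
def longest_run_one_side(points, center):
    """Length of the longest run of consecutive points on one side of `center`."""
    longest = current = 0
    previous_side = 0
    for p in points:
        side = (p > center) - (p < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == previous_side:
            current += 1
        else:
            current = 1 if side != 0 else 0
        previous_side = side
        longest = max(longest, current)
    return longest


# Hypothetical plotted values around a central line of 10.0.
values = [10.1, 9.9, 10.2, 10.3, 10.1, 10.4, 9.8]
run = longest_run_one_side(values, center=10.0)
```

For these values the longest run is four points above the central line. A laboratory might, for example, investigate when the run reaches some agreed length (a hypothetical threshold such as seven).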

Control charts, including the factors for calculating control limits, are discussed more
thoroughly elsewhere; ASTM Special Technical Publication STP 15D is an excellent
source of information [8]. Briefly, the central line is either the known value for the test
sample  (e.g.,  certified  value  if  an  SRM  is  used),  or  the  mean  of  at  least  15  sets  of  indepen- 
dent measurements.  The  standard  deviation  estimate,  s,  should  be  based  on  at  least  15  such 
measurements.     Control   limits   can   then   be   calculated   according   to   the   following  table. 


[Figure 5: two panels, A (upper) and B (lower), each plotted against Sequence]

Figure 5. Duplicate measurements made on SRM 122G. A (upper) - x̄ chart; B (lower) - R
control chart


Control Limits*

Central Line     X̄ (mean of ≥ 15 sets of measurements)

*For a more extensive treatment of control limits, see Ref.

For the above limits, n represents the number of repetitive measurements of the reference
sample, the mean of which will be plotted on the X̄ chart. For an X chart (single measurement
of the reference sample), n = 1.
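
As an illustration of how such limits might be computed, the sketch below assumes the conventional multiples for a normal distribution: warning limits at the central line ± 2s/√n (about 95 percent of plotted points) and control limits at ± 3s/√n (about 99.7 percent). These multiples are an assumption consistent with the percentages quoted earlier; the function name and numbers are hypothetical.

```python
from math import sqrt


def chart_limits(central_line, s, n=1):
    """Warning and control limits for a chart of means of n measurements.

    `s` is the standard deviation of a single measurement, estimated from
    at least 15 measurements; n = 1 gives the limits for an X chart.
    """
    warning = 2 * s / sqrt(n)   # assumed 2-sigma warning limits
    control = 3 * s / sqrt(n)   # assumed 3-sigma control limits
    return {
        "LCL": central_line - control,
        "LWL": central_line - warning,
        "CL": central_line,
        "UWL": central_line + warning,
        "UCL": central_line + control,
    }


# Hypothetical case: certified value 10.0, s = 0.2, means of 4 measurements.
limits = chart_limits(10.0, 0.2, n=4)
```

Plotting means of n measurements narrows the limits by the factor √n relative to an X chart of single measurements.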




Figure 5B represents a range (R) control chart. In chemical measurements, the difference
of duplicates is a good choice to plot on such a chart. The line R̄ represents the average
range obtained as the result of a reasonably large number (e.g., >15) of sets of duplicate
measurements. The warning and control limits are appropriate multiples of R̄ and have the same
significance as discussed for Figure 5A. The range is calculated without regard to sign
(absolute value), so the lower limits are zero.

The  factors  for  calculating  limits  are  discussed  in  the  reference  [8].  For  duplicate 
measurements,    they  are 

R̄       mean of the differences of >15 sets of duplicate measurements

UWL     2.512 R̄

UCL     3.267 R̄

LWL     0

LCL     0
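
The limits for a range chart of duplicates follow directly from the factors listed above. The sketch below uses those factors as given; the function name and the data are hypothetical.

```python
from statistics import mean


def range_chart_limits(duplicates):
    """Range-chart limits from (first, second) duplicate results.

    The text calls for more than 15 sets of duplicates; that requirement
    is left to the caller here.
    """
    ranges = [abs(a - b) for a, b in duplicates]  # sign is ignored
    r_bar = mean(ranges)
    return {
        "R_bar": r_bar,
        "UWL": 2.512 * r_bar,
        "UCL": 3.267 * r_bar,
        "LWL": 0.0,
        "LCL": 0.0,
    }


# Hypothetical data: 16 sets of duplicates, each differing by 0.2.
pairs = [(10.2, 10.0)] * 16
r_limits = range_chart_limits(pairs)
```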

In  use,  the  differences  of  duplicate  determinations  are  plotted  on  the  control  chart  and 
statistical  control  is  assumed  as  long  as  they  fall  within  the  expected  limits.  Again  they 
should  not  fall  disproportionately  outside  of  the  warning  limits  and  trends  should  not  be 
observed.

The  R  chart  is  based  on  the  known  relation  between  the  range  and  the  standard  deviation, 
hence it is a form of standard deviation or precision chart. When used with an X̄ or X chart,
it  is  useful  when  deciding  whether  an  observed  deviation  is  due  to  bias  or  to  a  change  in 
precision.     When   used   alone,    an   R   chart   will   monitor   precision    (but   not  bias). 
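
The relation between the range and the standard deviation can be made explicit for duplicates. The following is a sketch assuming normally distributed measurements; the symbols d_2 and d_3 are the customary control chart constants and are not defined elsewhere in this text. It reproduces, to within rounding, the factors 2.512 and 3.267 quoted above.

```latex
% Duplicate measurements (n = 2) from a normal distribution with standard deviation \sigma
\begin{align*}
E[R] &= d_2\,\sigma, & d_2 &= \frac{2}{\sqrt{\pi}} \approx 1.128, \\
\mathrm{SD}(R) &= d_3\,\sigma, & d_3 &= \sqrt{2 - \frac{4}{\pi}} \approx 0.853, \\
s &\approx \frac{\bar{R}}{d_2} = \frac{\bar{R}}{1.128}, \\
\mathrm{UWL} &= \bar{R} + 2\,d_3\,\frac{\bar{R}}{d_2} \approx 2.51\,\bar{R}, \\
\mathrm{UCL} &= \bar{R} + 3\,d_3\,\frac{\bar{R}}{d_2} \approx 3.27\,\bar{R}.
\end{align*}
```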

The  test  samples,  themselves,  when  measured  in  duplicate,  may  be  control  charted  to 
monitor  precision.  The  ranges  for  duplicate  measurements  of  a  class  of  samples  can  be  plotted 
on  the  same  control  chart,  as  long  as  they  are  expected  to  be  measurable  with  comparable 
precision. 

An  s  control  chart  is  based  on  plotting  the  estimate  of  the  standard  deviation  obtained 
from measurements of n replicates of the reference sample. Since a number of measurements are
required (at least seven is recommended) to estimate the standard deviation each time it is
charted, with any degree of reliability, and since some calculations are required, such charts
are seldom used in chemical measurements. R charts can provide sufficient monitoring of
precision  with  a  reasonable  amount  of  effort.  They  also  offer  the  advantage  of  using  a  rea- 
sonable number  of  the  actual  test  samples  to  monitor  precision.  The  use  of  an  R  chart  for 
test   samples   and   an  X   chart   utilizing  an  SRM  may   be   ideal   choices   for   many  laboratories. 

An X̄ control chart is more robust than an X chart. Since it is based on the mean of two
or more measurements, an occasional outlier will have limited influence on the decision
process. However, it requires additional work and this should be considered when using such a chart in a
quality assessment program. If the assessment strategy calls for confirmatory measurements of
reference samples when out-of-control is indicated, the advantage of an X̄ chart is lessened.
Such  a  strategy  is  most  effective  when  control  charts  are  maintained  and  used  in  a  real-time 
mode  which  will  also  provide  the  advantage  of  ability  to  take  immediate  corrective  actions  and 
thus  minimize   the   uncertainty   of  data. 

The  question  of  how  to  obtain  the  statistical  information  necessary  to  construct  a 
control  chart  needs  to  be  considered.  Once  the  decision  to  develop  a  control  chart  is  made, 
one  might  want  to  acquire  the  standard  deviation  and  mean  data  as  quickly  as  possible,  but 
this  could  be  misleading.  It  has  been  mentioned  that  measurements  made  over  a  short  period  of 
time  show  greater  consistency  than  those  obtained  over  a  long  period  of  time.  Since  the  con- 
trol chart  will  be  used  over  a  period  of  time,  the  latter  is  more  appropriate  for  judging 
performance.

To  develop  control  limits  based  on  long-term  behavior,  it  is  recommended  that  at  least  15 
data  points  be  accumulated  and  that  no  two  points  be  obtained  on  the  same  day.  This  recommen- 
dation applies  only  to  obtaining  the  standard  deviation  data  to  establish  control  limits  for 
an X̄ or X chart and for preparing an s or R control chart.

If  an  SRM  is  used  as  the  control  chart  reference  sample,  the  value  for  the  central  line 
is  known,  namely  the  certified  value.  If  a  laboratory's  own  internal  reference  materials  are 
used,  much  work  may  be  required  if  the  "true  value"  for  the  central  line  is  to  be  used.  Some 
laboratories  use  an  analytical  mean  value  for  the  central  line,  and  assume  that  this  is 
essentially  the  "true  value".  This  may  be  true  only  if  the  measurement  process  has  been 
demonstrated  to  have  negligible  bias.  The  use  of  an  analytical  mean  as  the  value  for  the 
central  line  can  be  useful  in  indicating  stability  of  a  process  but  bias  can  be  evaluated  only 
when   true  values   are  known. 

3.4     Frequency of Use of Reference Materials


The  optimum  frequency  of  use  of  reference  samples  and  also  of  replicates  of  actual  test 
samples  will  depend  on  the  stability  of  the  measurement  system  and  the  risk  involved  when  the 
system departs from statistical control (9). Since all data obtained during the period from
last-known-in-control to first-known-out-of-control are suspect, such intervals may need to be
minimized.  The  real-time  use  of  control  charts  and/or  reference  material  data  is  a  further 
consideration.  While  the  following  discussion  is  directed  toward  control  chart  maintenance, 
the  same  philosophy  applies,  whether  this  is  done  or  whether  the  results  on  reference  samples 
are   interpreted  by  other  means. 


There  are  several  empirical  approaches  to  deciding  on  the  frequency  of  use  of  reference 
samples.  The  experience  of  the  laboratory  may  indicate  the  expected  frequency  of  occurrence 
of  trouble,  in  which  case  reference  sample  measurements,  at  least  three  in  number,  should  be 
equally  spaced  within  such  an  interval.  Another  approach  is  the  "length  of  run"  concept.  In 
this,  recognizable  breaks  in  the  production  (of  data)  process  are  identified  which  could  cause 
significant  changes  in  precision  or  bias.  Such  breaks  could  include  change  of  work  shift; 
rest  periods;  change,  modification,  or  adjustment  of  apparatus;  use  of  new  calibration  stan- 
dards; significantly  long  down-times;  use  of  a  new  lot  of  reagents.  At  least  three  reference 
samples  should  be  measured  during  any  of  these  periods  when  the  periods  are  considered  to  be 
potentially  significant. 

In  summary,  the  measurement  of  reference  materials  is  a  risk-reducing  procedure. 
However,  if  it  involves  more  than  10  percent  of  a  laboratory's  measurement  effort,  either  the 
quality  control  process  may  need  improvement  or  too  much  effort  is  being  exerted  in  this 
direction.  If  less  than  5  percent  of  effort  is  devoted  to  such  measurements,  the  laboratory 
may  be  taking  too  high  a  risk  of  producing  unacceptable  data,  or  may  not  even  know  the  quality 
of  the  data  it  is  producing.  The  above  statements  are  made  with  a  laboratory  making  a  signi- 
ficant number  of  high-quality  routine  measurements  in  mind.  If  a  laboratory's  program 
involves  occasional  or  one-of-a-kind  measurements,  the  amount  of  quality  assurance  effort 
required, including the number of measurements of reference materials to be made, may be
significantly  more  than  that   indicated  above. 

Suggested  measurement  schedules  for  efficient  utilization  of  reference  materials  are 
given  in  Table  1  and  Table  2  (9).  The  sequence  in  Table  1  utilizes  a  combination  of  an 
internal  reference  material  (IRM)  and  a  SRM.  The  sequence  in  Table  2  utilizes  a  limited  number 
of  duplicate  or  split  samples  together  with  reference  materials.  In  either  case,  the  use  of 
control  charts  is  recommended  on  a  real-time  basis.  Recommended  critical  decision  points  in 
the  measurement  sequences  are  also  indicated. 


Table 1. Quality Assessment Using RM's

Daily/Event Schedule

CALIBRATION - FULL EXPECTED RANGE
RM
TEST SAMPLES - GROUP 1
RM
TEST SAMPLES - GROUP 2
TEST SAMPLES
CALIBRATION - MIDRANGE POINT

NOTES

•  DECISION POINT
MAINTAIN CONTROL CHARTS
    X-CONTROL CHART
    R-CONTROL CHART, IRM
SYSTEM MUST BE IN CONTROL AT DECISION POINTS
AT LEAST 2 GROUPS; MAXIMUM OF 10 SAMPLES IN EACH GROUP
AT LEAST ONE SRM MEASUREMENT SHOULD BE MADE DURING EACH SEQUENCE/DAY

Table 2. Quality Assessment Using Duplicates/Splits

Sequence Schedule

CALIBRATION - FULL EXPECTED RANGE
CALIBRATION CHECK - MIDRANGE POINT
SAMPLE
SAMPLE
SAMPLE
SAMPLE
SAMPLE
•  SAMPLE
•  CALIBRATION CHECK - MIDRANGE POINT
•  CALIBRATION CHECK - MIDRANGE POINT/DUPLICATE

NOTES

•  -  DECISION POINT

1. MAINTAIN CONTROL CHARTS
   a. DUPLICATE MIDRANGE CALIBRATION
   b. DUPLICATE/SPLIT SAMPLE
   c. X-CONTROL CHARTS, SRM AND IRM
2. SYSTEM MUST BE IN CONTROL AT DECISION POINTS
3. IF MORE THAN 20 SAMPLES, REPEAT SEQUENCE
4. IF LESS THAN 20 SAMPLES, DIVIDE INTO TWO GROUPS AND FOLLOW SIMILAR PLAN
5. AT LEAST ONE SRM MEASUREMENT SHOULD BE MADE DURING EACH SEQUENCE/DAY

4.     Reference Materials


4.1      Role  of  Reference  Materials 


In  the  most  general  terminology,  a  reference  material  (RM)  is  a  substance  for  which  one 
or  more  properties  are  established  sufficiently  well  for  use  to  calibrate  a  chemical  analyzer 
or  to  validate  a  measurement  process  (10,11,12).  An  internal  reference  material  (IRM)  is  such 
a  material  developed  by  a  laboratory  for  its  own  internal  use.  An  external  reference  material 
(ERM)  is  one  provided  by  someone  other  than  the  end-user.  A  certified  reference  material 
(CRM)  is  a  RM  issued  and  certified  by  an  organization  generally  accepted  to  be  technically 
competent  to  do  so.  A  Standard  Reference  Material  (SRM)  is  a  certified  reference  material 
issued   by  NBS. 

A  reference  material  is  for  use  in  a  decision  process,  hence  the  requirement  for 
reliability  of  the  value  of  the  property  measured  must  be  consistent  with  the  risk  associated 
with  a  wrong  decision.  The  appropriateness  of  the  reference  material  in  the  decision  process 
must  also  be  considered.  For  some  purposes,  a  simple  substance,  mixture,  or  solution  will  be 
adequate  and  the  value  of  the  property  may  be  calculated  from  the  data  for  its  preparation. 
However,  even  this  is  best  verified  by  suitable  check  measurements  to  avoid  blunders.  Many 
decision  processes  require  a  natural  matrix  reference  material  which  may  necessitate  extensive 
blending and homogenization treatments and complex analytical measurements. In such cases,
only a highly competent organization may have the resources and experience to do the necessary
work.


The  terms  certificate  and  certification  merely  refer  to  the  documentation  that  supports 
the  reference  material.  Guidelines  for  the  content  of  certificates  for  reference  materials 
have  been  prepared  by  the  International  Standards  Organization  (13,14,15).  They  recommend  the 
kind  of  information  the  certificate  should  contain  but  do  not  describe  how  it  should  be 
obtained.  Furthermore,  there  are  no  guidelines  for  judging  the  relative  quality  of  reference 
materials.


The  guiding  principle  in  issuing  a  SRM  is  that  it  will  be  used  for  measurement  quality 
assessment,  hence  the  property  certified  must  be  accurately  known.  The  uncertainty  in  the 
certified  values  takes  into  account  that  due  to  measurement  and  any  variability  (inhomogene- 
ity)  between  and/or  within  samples  of  the  material  (16).  Definitive  methods  are  used  for 
establishing  the  values  of  the  certified  properties  or  they  are  measured  by  two  or  more  inde- 
pendent reliable  methods  in  which  case  the  results  must  agree  to  minimize  the  chance  for  mea- 
surement bias.  All  certification  measurements  are  described  in  the  certificate  or  are 
referenced. The certification measurements are preceded by stability studies as appropriate
to  set   limits   on   the   life   expectancy   of   the  material. 

4.2     Concept   of  Traceability 

The  concept  of  traceability  to  national  standards  has  been  advanced  in  recent  years  to 
facilitate intercalibration of laboratories and compatibility of measurements. Traceability
as  related  to  a  standard  may  be  likened  to  genealogy  in  that  it  may  describe  the  chain  of  cal- 
ibrations related  to  establishing  its  value,  including  the  intermediate  standards  that  were 
used  and  the  various  measurements  involved.  In  the  area  of  physical  measurements,  calibra- 
tions of  standards  or  artifacts  with  respect  to  the  national  measurement  standards  can  be  made 
at  NBS  with  high  precision.  These  may  be  made  for  secondary  calibration  laboratories  who  in 
turn  calibrate  standards  for  others,  and  so  forth.  Each  time  this  is  done,  the  uncertainty  is 
increased  due  to  uncertainties  in  a  laboratory's  own  standards  and  propagation  of  the  uncer- 
tainty of  measurement.  Measurement  assurance  programs  (MAP's)  are  designed  to  minimize  the 
latter  and  thus  decrease  the  accumulation  of  uncertainty  as  measurements  go  lower  down  the 
measurement chain. The various measurements must be made with adequate quality assurance if
reliable limits of uncertainty are to be assigned. The responsibility for this is that of the
measurement laboratory.
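
The growth of uncertainty down a calibration chain can be illustrated numerically. The sketch below assumes that the uncertainty contributed at each step is independent of the others and that the contributions combine in quadrature (root sum of squares). That combination rule is an assumption made for illustration, not a procedure stated in this handbook, and the function name and values are hypothetical.

```python
from math import sqrt


def accumulated_uncertainty(step_uncertainties):
    """Running combined uncertainty after each step of a calibration chain.

    Assumes independent contributions combined in quadrature.
    """
    total_squared = 0.0
    running = []
    for u in step_uncertainties:
        total_squared += u * u
        running.append(sqrt(total_squared))
    return running


# Hypothetical contributions (same units) from three successive calibrations.
chain = [3.0, 4.0, 12.0]
combined = accumulated_uncertainty(chain)  # uncertainty after each step
```

Under this assumption the combined uncertainty can only grow from step to step, and the largest single contribution dominates the total.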

While  chemical  measurements  rely  to  some  degree  on  physical  measurements  and  require 
calibrated  physical  standards,  very  few  chemical  standards  are  disseminated  in  the  same  manner 
as  the  physical  ones  are.  Hence  it  is  difficult  if  not  impossible  to  establish  the  traceabi- 
lity of  most  chemical  standards  to  other  such  standards  and  especially  to  national  standards. 
An  exception  to  the  above  is  when  SRM's  are  used  either  as  calibrants  for  a  measurement  pro- 
cess or  as  primary  standards  for  chemical  analysis.  All  measurements  using  such  SRM's  have 
the capability of being traceable to a common set of standards and the intercalibration of
laboratories  is  facilitated.  Relatedly,  certain  commercial  suppliers  are  producing  reference 
materials, certified with respect to specific SRM's, and protocols for issuing such materials
as  Certified   Reference  Materials   have   been   developed  (17). 

While  an  SRM  ordinarily  does  not  provide  traceability  in  its  narrowest  interpretation,  it 
may  serve  a  broader  and  more  useful  function  to  provide  measurement  assurance  which  ensures 
both  proper  calibration  and  acceptable  utilization  of  methodology.  When  specific  SRM's  are 
commonly used in a systematic manner, as by means of control charts, intercalibration of all
laboratories  using  such  and  compatibility  of  data  may  be  achieved  as  shown  by  Figure  6.  Thus 
measurement  networks  can  specify  the  SRM's  to  be  used  and  the  quality  assessment  procedure  to 
be followed to attain compatibility of monitoring data, for example. While acceptable SRM
data indicate acceptable performance of the measurement system, discrepant results may not be
simple to interpret, since they could indicate calibration uncertainties, application problems,
or both. One cannot completely rule out misapplication of methodology or inappropriate
methodology as a source of trouble. However, a well-designed quality assurance program should
facilitate the identification of the source of the problem.


Figure 6. Measurement compatibility by intercalibration, using SRM's
[Diagram labels: NBS National Standards; SRM; NBS Measurements; User Measurements; Compatible Data]


5.     Standard Reference Materials


5.1      Philosophy  of  SRM  Production 

Standard  reference  materials  are  considered  to  be  services  to  the  individual  user  who 
must  pay  the  full  cost  of  the  service  provided.  Costs  of  development,  preparation,  certifica- 
tion, and  marketing  are  accumulated  and  pro-rated  on  the  basis  of  the  number  of  saleable  units 
that  are  produced.  Thus  the  costs  and  benefits  are  prime  considerations  in  authorizing  and 
issuing an SRM. The production of low-demand, high-production-cost, and hence high unit-cost
SRM's is accordingly difficult, though not impossible.

SRM production is often preceded by a substantial research effort. Methodology may need
to be developed or potential bias problems must be solved if accurate certification is to be
done. Materials-related problems such as stability, homogenization techniques, and proper
conditions  for  packaging  and  storage  may  need  investigation.  Often  the  results  of  such 
research  are  applicable  to  wider  areas  of  science  and  technology  or  at  least  to  broader  areas 
of  SRM  certification.  In  such  cases,  the  costs  of  such  work  may  be  supported  from  general 
research  funds  and  not  charged  to  production  of  an  SRM.  Otherwise  all  costs,  including 
research  and  development,  must  be  recovered  from  sales.  This  increases  the  unit-costs  of 
SRM's  and  impacts  on  the  development  of  new  items  for  which  substantial  research  and 
development   costs   are  necessary. 

5.2     How  an  SRM   is  Produced 

Identification   of  Need 

SRM's  are  developed  to  meet  measurement  needs.  The  need  may  be  specific,  as  the  result 
of a regulatory issue, or general, as the result of a widespread measurement problem. The
need  may  come  to  the  attention  of  NBS  in  the  form  of  a  specific  request,  or  as  the  result  of 
NBS  scientists'  interactions  with  the  measurement  community.  Because  the  reference  material 
program  must  be  self-supporting,  the  magnitude  of  the  need,  cost  of  development,  and  the  pro- 
spect of  cost  recovery  through  sales  together  with  the  technological  chances  of  success  are 
important   considerations   in   establishing   the   feasibility   of   issuing  an  SRM. 

Determination of Characteristics/Properties/Specifications

The  necessary  properties  of  a  useful  reference  material  need  careful  consideration.  The 
kind  and  level  of  parameters  certified,  the  matrix  and  other  physical  characteristics,  homo- 
geneity requirements,  and  the  maximum  acceptable  uncertainties  for  the  certified  values  are 
key  considerations.  While  a  reference  material  is  developed  for  a  specific  use,  it  is  often 
possible  to  extend  its  usefulness  to  other  areas  by  certification  of  additional  parameters  or 
by  modifying  the  matrix.  In  doing  the  above,  it  must  be  considered  that  modification  from  a 
specific  to  a  generic  standard  could  possibly  limit  its  usefulness  for  the  initial  purpose 
while  not  significantly  extending  its  areas  of  application.  Moreover,  the  certification  of 
additional  parameters  can  increase  costs  unless  compensated  by  sufficient  additional  sales, 
and  users  do  not  ordinarily  like  to  pay  for  information  (in  this  case  certificate  values) 
which   is   not   of   direct   use   to  them. 

From    considering    factors    such    as    those    discussed    above,    and    from    discussions    with  the 
user    community,    minimum    specifications    for    a    candidate    SRM    may    be    drafted.       These    may  take 
into   consideration   materials   available  on   the   market   or   suitable  materials  may   need   to  be  pro- 
duced  to  meet   the   specifications.      In   some   cases,    NBS  must   prepare   the  material   or   at   least  a 
prototype   for   initial  testing. 

Often,  preliminary  research  and  development  efforts  are  necessary  to  evaluate  the 
feasibility   of   production   of  SRM's   and/or   to   develop  specifications. 

Preliminary  Studies 

After  a  material  has  been  obtained,  measurements  are  made  to  evaluate  its  compliance  with 
the  specifications.  While  the  exact  level  of  the  analyte  is  often  not  a  controlling  require- 
ment, homogeneity  is  always  an  important  requirement  including  both  within  and  between  units 
of  issue.  Ordinarily,  it  is  desirable  to  certify  the  material  as  a  lot  rather  than  as 
individual    items,    in  which   case   homogeneity   between   units   of   issue   must   be  acceptable. 

Homogeneity 

Homogeneity  evaluation  may  be  done  in  two  phases.  Preliminary  measurements  may  need  to 
be  made  to  accept  material  for  conformance  with  specifications  and  to  decide  on  such  questions 
as  pre-mixing  and  subdivision  into  units  of  issue  (e.g.,  bottling)  prior  to  certification 
analysis. When a multicomponent/parameter SRM is involved, this can be a major undertaking if
homogeneity determinations for each constituent/property are to be undertaken at this time.
When  possible,  a  quick  and  precise  method  is  sought  to  evaluate  homogeneity.  In  multiparam- 
eter materials,  this  may  not  be  possible  for  each  component  in  which  case  initial  homogeneity 
may need to be judged on the basis of that of a limited number of typical constituents.

Final homogeneity evaluation is made from interpretation of certification data on each
individual  constituent  or  property.  This  requires  design  and  execution  of  the  measurement 
program   so   that   variance   of   measurement   and   sample   composition   can   be   individually  evaluated. 


Measurement 


The  certification  measurements  are  conducted  according  to  a  quality  assurance  plan 
established  before  the  work  is  actually  begun.  This  requires  development  of  a  statistical 
plan  for  sampling  and  measurement,  selection  of  methodology  which  has  been  demonstrated  to  be 
reliable,  maintenance  of  statistical  control  of  the  measurement  process,  and  quality 
assessment of the data by concurrent measurement of suitable reference materials where possible.

The  methodology  is  selected  on  the  basis  of  the  following  considerations.  When  possible, 
the  attainable  accuracy  of  measurement  should  be  better  than  that  required  for  use  of  the 
data. The first choice for methodology is a method of known and demonstrable accuracy. The
term  "definitive  method"  has  been  coined  for  such  and  is  finding  considerable  usage,  especi- 
ally in  relation  to  reference  material  analysis.  A  definitive  method  is  one  based  on  sound 
theoretical principles and which has been experimentally demonstrated to have negligible
systematic errors and a high level of precision. While a technique, that is to say a measurement
principle, may be conceptually definitive, a method based on such a technique must be
demonstrated to deserve such a status for each individual application.

An  example  of  a  definitive  technique  is  isotope  dilution  mass  spectrometry  for  trace 
analysis  in  which  one  relates  the  concentration  of  unknown  samples  directly  to  the  actual 
weights  of  spikes  of  isotopes  or  isotopically  labeled  compounds.  A  mass  spectrometer  is  used 
to  measure  isotopic  ratios,  obviating  the  need  for  instrumental  corrections.  The  only  theore- 
tical uncertainty  in  such  a  process  is  the  question  of  the  ability  to  recover  a  natural 
analyte  as  compared  with  that  of  a  spike.  The  accuracy  attainable  will  depend  on  isotopic  and 
chemical   purity  of   the   spike   and   the   care   used   in   preparation   and  measurement. 
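
The principle can be sketched as a two-isotope calculation. The sketch below assumes a simplified case with no blank, mass-bias, or recovery corrections. The function and parameter names are hypothetical, and the abundances are illustrative rather than taken from any certificate.

```python
def idms_moles(n_spike, r_blend, a_sample, b_sample, a_spike, b_spike):
    """Moles of analyte element in the sample from an isotope-dilution blend.

    n_spike  : moles of element added as spike (known from its weight)
    r_blend  : measured isotope ratio A/B in the sample + spike blend
    a_*, b_* : atom fractions of isotopes A and B in the sample and spike

    Derived from the mixing equation
        r_blend = (n_x*a_sample + n_spike*a_spike)
                / (n_x*b_sample + n_spike*b_spike)
    solved for n_x.
    """
    denominator = r_blend * b_sample - a_sample
    if denominator == 0:
        raise ValueError("blend ratio equals the sample ratio; spike not detectable")
    return n_spike * (a_spike - r_blend * b_spike) / denominator


# Round-trip check with illustrative numbers: build the blend ratio from a
# known amount of analyte, then recover that amount from the ratio.
n_x_true, n_s = 2.0e-6, 1.0e-6
a_x, b_x = 0.90, 0.10   # sample: mostly isotope A
a_s, b_s = 0.05, 0.95   # spike: enriched in isotope B
r = (n_x_true * a_x + n_s * a_s) / (n_x_true * b_x + n_s * b_s)
print(idms_moles(n_s, r, a_x, b_x, a_s, b_s))
```

Only an isotope ratio and the spike amount enter the calculation, which is why the result does not depend on an instrumental response calibration.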

Examples  of  other  definitive  techniques  are  gravimetry  and  coulometry.  Both  are  based  on 
fundamental  measurements  that  can  be  made  with  high  accuracy.  As  in  the  case  of  any  methodo- 
logy, it  must  be  demonstrated  that  no  significant  systematic  errors  are  relatable  or  present 
in  their  use  in  a  specific  application.  When  using  such,  possible  biases  of  application  are 
minimized  by  the  use  of  multiple  analysts/instruments  to  the  extent  possible.  Redundancy  of 
measurements   in  random   sequence   is   another   technique   to   avoid   application  bias. 
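
A randomized run order of the kind mentioned above might be generated as follows. This is a minimal sketch; the sample labels, replicate count, and seed are hypothetical.

```python
import random


def randomized_run_order(sample_ids, replicates, seed=None):
    """Return a run list in which every sample appears `replicates` times,
    shuffled so that instrumental drift is not confounded with sample identity."""
    rng = random.Random(seed)
    runs = [sample_id for sample_id in sample_ids for _ in range(replicates)]
    rng.shuffle(runs)
    return runs


order = randomized_run_order(
    ["unit-01", "unit-02", "unit-03", "control-SRM"], replicates=3, seed=1985
)
print(order)
```

Fixing the seed makes the sequence reproducible, so the planned order can be recorded with the data.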

Because definitive methods are not always available, the multi-technique approach is one
often  used  in  certification  of  SRM's.  Parameters  are  measured  by  at  least  two  independent 
techniques  when  possible  and  such  measurements  must  agree  within  reasonable  limits  to  permit 
certification. Whenever significant discrepancies occur, additional work is carried out to
reconcile them; otherwise the values cannot be certified but may be reported for informational
purposes only.

Another mode of certification, which may be called the multi-laboratory approach, is used
for  renewal  of  certain  compositional  SRM's.  A  group  of  laboratories  of  recognized  competence 
use  methods  of  proven  accuracy  and  the  corresponding  existing  SRM  as  a  control  to  analyze  a 
renewal  SRM.  Any  significant  discrepancies  are  resolved  by  careful  scrutiny  of  the  data  or  by 
reanalysis   using   the   same  or   independent  methodology. 

In  a  few  cases,  SRM's  are  certified  for  the  value  of  a  constituent  or  property  that  is 
method  dependent,  because  existing  technology  requires  such.  An  example  of  such  is  the 
Kjeldahl  nitrogen  value.  In  such  cases,  demonstration  of  statistical  control  of  the  measure- 
ment process  and  agreement  of  results  by  independent  analysts  is  a  requirement  for 
certification. 


Evaluation  of  Data 


All  SRM  data  are  given  a  thorough  statistical  analysis  to  establish  limits  of 
uncertainty.  This  will  include  that  due  to  measurement  and  to  any  variability  of  the  units  of 
issue.  The  advance  cooperation  of  statisticians  in  planning  the  experimental  programs  is 
essential  if  the  proper  measurements  are  to  be  made  to  enable  a  thorough  evaluation  of  the 
reported  values  for  the  SRM.  The  interpretation  of  the  certified  values  is  discussed  later  in 
this   handbook    (see   5.4   and  5.5). 


Follow-up


SRM production is ordinarily preceded by studies of the stability of candidate
materials.  When  possible,  only  materials  which  have  a  long  shelf  life  are  selected.  If  there 
is  any  limitation  on  stability,  it  is  indicated  on  the  certificate.  In  addition,  NBS  does 
'shelf life' analysis on certain SRM's when it is considered that some deterioration could be
possible.

NBS ordinarily prepares an SRM in sufficient quantity so that a several-year supply is
available,  at  the  time  of  issue,  based  on  anticipated  demand.  Sometimes  demand  exceeds  expect- 
ations. A  few  SRM's  are  prepared  in  limited  lots  due  to  various  considerations,  such  as  shelf 
life, for example.

NBS  aims  to  keep  most  SRM's  in  stock  for  ready  issue  and  to  renew  SRM's  before  the  stock 
is  depleted.  However,  unanticipated  demands  can  cause  delays,  and  changes  in  technology  may 
cause  cancellation  of  plans  for  continual  stocking  because  of  conflicting  priorities  of 
competitive  items. 

It  is  the  further  aim  of  NBS  to  make  SRM's  as  useful  as  possible  to  purchasers.  Customer 
service  can  be  provided  in  many  cases  including  advice  on  use.  In  the  case  of  questions 
arising  from  use,  inquiry  to  the  Office  of  Standard  Reference  Materials  (OSRM)  will  get  quick 
response  and  every  effort  will  be  made  to  provide  satisfactory  solutions  to  application 
problems.

An SRM is ordinarily certified using state-of-the-art methodology. This may be the
methodology  widely  used  in  practical  analysis  with  special  care  given  to  calibration,  to  qua- 
lity control,  and  to  elimination  of  sources  of  bias.  In  other  cases,  the  methodology  may  be 
suitable  only  for  research  laboratory  use.  For  example,  isotope  dilution  mass  spectrometry  is 
often used, which would be inappropriate as a routine technique because of time and cost
considerations. In any case, the certified value is ordinarily independent of the method of
measurement. When certified values are method dependent, the methodology used in certification is
always  named  in  the  certificate  together  with  references  where  detailed  information  can  be 
found.  For  a  number  of  SRM's,  a  so-called  NBS  260  publication  describes  the  measurement 
process   in   detail    (see   front   of   this   publication   for   current  listing). 

5.3     Differences  Among  Measurement  Methods 

Agreement  of  measured  values  by  two  or  more  independent  measurement  methods  is  one  of  the 
approved  conditions  for  certification.  Of  course,  measured  values  never  agree  perfectly  so 
the  statistical  significance  of  disagreement  must  be  considered.  Results  may  not  agree  within 
their respective uncertainties for several reasons.

1.  Matrix   effects   in   one   or   each   method   may   not    be   fully    compensated    by   the  calibration 
procedure  used. 

2.  Systematic  errors  may  not   be  fully  compensated  or   unsuspected  ones  may  exist. 

3.  It    is    in   the   nature   of    things    that    the   more    precise    one    can   make    a   measurement,  the 
smaller   the  difference  that   can   be  detected. 

4.  A    fourth   reason   for   two   methods    to   disagree    is    perhaps    the   most    common    cause   -  the 
standard   deviation  of   one  method,   or   both,    is  underestimated. 

In  any  event,  systematic  differences  in  measured  values  must  be  examined  for  their 
practical  significance  and  values  are  not  certified  unless  reasonable  discrepancies  can  be 
resolved.
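
One simple way to judge whether two method means differ by more than their uncertainties allow is sketched below. It assumes independent results and uses the standard error of each mean with a coverage factor of 2. These choices, the function name, and the data are illustrative assumptions and not the NBS certification procedure.

```python
import math
import statistics


def methods_agree(results_a, results_b, coverage=2.0):
    """Compare replicate results from two independent methods.

    Returns (difference, allowance, agree), where allowance is
    coverage * sqrt(se_a**2 + se_b**2) and se is the standard error of a mean.
    """
    mean_a = statistics.mean(results_a)
    mean_b = statistics.mean(results_b)
    se_a = statistics.stdev(results_a) / math.sqrt(len(results_a))
    se_b = statistics.stdev(results_b) / math.sqrt(len(results_b))
    difference = abs(mean_a - mean_b)
    allowance = coverage * math.sqrt(se_a ** 2 + se_b ** 2)
    return difference, allowance, difference <= allowance


# Hypothetical replicate results from two methods (same units)
method_1 = [10.1, 10.3, 9.9, 10.2, 10.0]
method_2 = [10.2, 10.4, 10.1, 10.3, 10.2]
print(methods_agree(method_1, method_2))
```

Note how the fourth reason listed above shows up in such a test: if either standard deviation is underestimated, the allowance shrinks and an apparent disagreement results.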

5.4     Understanding  Certificate  Information

SRM  certificates  provide  a  variety  of  information  about  the  particular  material. 
Compositional  values  with  uncertainty  limits  are  given  for  all  certified  analytes.  Ordinarily 
the latter are for the 95% level of confidence and include allowances for the uncertainties of
known  sources  of  systematic  error  as  well  as  the  random  error  of  measurement.  Many  certifi- 
cates also  will  include  values  for  other  parameters  or  analytes  which  are  reported  for  "infor- 
mational purposes"  only.  These  values  are  so  reported  because  they  were  measured  by  only  one 
technique,  they  are  the  results  from  discrepant  measurements  by  several  techniques,  or  there 
are  homogeneity  problems  which  detract  from  their  analytical  usefulness.  Such  values  may  have 
uncertainty  values  assigned  to  them,  as  well,  but  they  represent  the  analysts'  best  judgment 
of   the   random   error  uncertainty. 

The  certificate  sometimes  will  describe  restrictions  in  the  use  of  the  sample  which  must 
be  adhered  to  for  reliable  results.  One  of  these  concerns  drying.  Whenever  this  is  critical, 
instructions  for  doing  so  are  included  and  must  be  followed.  In  the  case  of  several  SRM's, 
some  elements  must  be  determined  on  pre-dried  samples  while  others  are  determined  for  moist 
samples   with  subsequent   correction   to   dry  weight,    based   on  a  moisture  determination. 
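
The correction to dry weight is a simple proportion, sketched below. The function name and the numbers are hypothetical; the moisture fraction is assumed to be expressed as mass of water per mass of moist sample.

```python
def to_dry_basis(value_as_received, moisture_fraction):
    """Convert a concentration measured on a moist sample to a dry-weight basis.

    moisture_fraction is the mass fraction of water in the moist sample
    (for example 0.045 for 4.5 percent), from a separate moisture determination.
    """
    if not 0.0 <= moisture_fraction < 1.0:
        raise ValueError("moisture_fraction must be at least 0 and less than 1")
    return value_as_received / (1.0 - moisture_fraction)


# Hypothetical example: result on the moist sample with 4.5 percent moisture
print(to_dry_basis(19.1, 0.045))
```

The moisture determination should be made on a separate portion taken at the same time as the analytical sample, so that both refer to the same moisture content.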

For  heterogeneous  materials,  the  minimum  weight  of  an  analytical  sample  may  be  specified. 
This   requirement   should   be   followed   if   certified   values   are   to   be  realized. 

In  the  case  of  some  SRM's,  segregation  is  a  potential  or  actual  problem  in  that  the 
material,  though  mixed  at  the  time  of  certification,  may  segregate  on  standing.  The  certifi- 
cate may  instruct  the  user  to  shake,  rotate,  stir,  or  otherwise  reconstitute  the  material. 
Failure to do so will not only invalidate the present measurement but may jeopardize further
measurement from the same container, due to disproportionate withdrawal of constituents.

Storage of some SRM's may need to be done under prescribed conditions. Refrigeration
and/or  freezing  may  be  necessary,  and  protection  from  moisture,  once  opened,  or  from  radiant 
energy may be necessary. Because of such problems, some SRM's are certified for first use
only.

In  some  cases  certification  is  valid  only  for  a  finite  lifetime  (e.g.,  1  year,  5  years). 
This  is  to  limit  NBS  liability  to  individually  notify  users  in  case  there  is  a  change  in  the 
material.  The  lifetime  is  always  calculated  from  the  time  of  shipment  --  it  is  never  related 
to   packaging   dates  marked  on   the  container. 

These  and  other  restrictions  are  necessary  to  protect  the  integrity  of  the  sample  or  to 
ensure  results  that  will  be  consistent  with  the  certified  values.  NBS  can  accept  no  responsi- 
bility for   validity  of   the   material   if   such   instructions   are   not   faithfully  followed. 

5.5     Uncertainty  of   Certified  Values1 

For     the     purpose  of   this   discussion,      SRM   certification   can   be   divided   into   two  classes: 

A.  Each   unit   is  measured   and   carries    its   own   value    (e.g.,    permeation  tubes). 

B.  Samples    chosen    statistically    from    the    lot    of    SRM    are    measured,    and    one  certificate 
gives   the   value   for   all   units    (most   chemical  types). 

The  uncertainty  of  the  certified  values  for  group  A  SRM's  will  depend  entirely  on  the 
uncertainty  of  measurement.  This  will  be  based  on  the  standard  deviation  and  best  estimates 
of uncertainties of the systematic errors which have been corrected for, to the extent
possible.

For group B SRM's, any differences in the certified property within the units or between
units of the lot pose a problem. A sampling and measurement scheme has to be devised to
determine whether inhomogeneity exists and to estimate its magnitude whenever it is important
to  the   use   of   the  SRM. 


Homogeneity  checks  may  be  made  using  two  different  sampling  schemes.  In  one,  a  batch  of 
material  is  subsampled,  using  a  statistically  developed  scheme,  and  measurements  are  made  to 
detect  significant  differences  in  the  compositions  of  the  samples.  This  has  the  advantage 
that  grossly  heterogeneous  material  would  be  rejected,  thus  saving  the  time  and  cost  of  pack- 
aging unacceptable  material.  It  has  the  disadvantage  that  further  heterogeneity  could  be 
introduced  in  acceptable  material  by  segregation,  discrimination,  or  contamination  during 
packaging.  For  material  believed  to  be  essentially  homogeneous,  bulk  examination  may  be  the 
method  of  choice.  Once  homogeneity  is  confirmed,  the  material  may  be  analyzed  and  packaged  as 
required,   which   could   have   advantages   in   some  cases. 

Material  which  is  considered  to  have  measurable  heterogeneity  is  best  checked  after 
packaging into bottles, vials, or other units of issue. Not only is it possible to detect original
heterogeneity  but  also  any  that  might  result  from  the  packaging  process.  Three  kinds  of 
heterogeneity  are   generally  possible: 


a.  Between units vs. within units

b.  Trend or pattern: along a rod, within a sheet or block of material, in the order of
    preparation, etc.

c.  Between blocks of units: processed on different days, between drums, between lots,
    etc.


The  sampling  scheme  used  for  each  SRM  depends  to  a  large  extent  on  the  subject  expert's 
knowledge and experience of what particular type of heterogeneity is most likely to occur, and
the sampling scheme will be designed predominantly to check on variability due to that source.
A  knowledge  of  the  details  of  the  packaging  process  is  also  required,  such  as  the  sequence  of 
filling  bottles   or   the   order    in  which   specimens   were   cut   from   a  massive  material. 


If  the  material  is  found  to  be  essentially  homogeneous,  it  is  accepted;  if  the  material 
shows  large  variability,  it  is  rejected.  Often  the  variability  is  at  about  the  level  of  what 
can  be  detected  by  a  particular  analytical  method.  In  that  case,  the  analytical  error  and  the 
heterogeneity of the material both contribute to the uncertainty in the final product, the SRM
units.


Let $\sigma_m$ be the standard deviation of the analytical method, and $\sigma_c$ be the standard
deviation of the value of the individual units about the mean value of the lot. Then the
standard deviation $\sigma$ of a single measurement on a unit drawn at random from the lot would be

$$\sigma = \sqrt{\sigma_m^2 + \sigma_c^2}$$

Reliable estimates of two of the three sigmas allow an estimate of the third.

When possible, $\sigma_m$ is evaluated independently of the measurement of the SRM. In this
case, the standard deviation, $\sigma$, of the measured values of individual samples, together with
the value for $\sigma_m$, permits an estimation of $\sigma_c$.

1 From a lecture by H. H. Ku, National Bureau of Standards, presented at Precision and Accuracy
Seminar, 27 March 1980. See also Ref. 10, p. 296, and Ref. 12.

Often, considerations of possible matrix effects preclude or raise questions about the
independent estimation of $\sigma_m$. And even if $\sigma_m$ can be assumed to have a certain value, it may
be necessary to verify it for the SRM certification measurements.

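By duplicate measurements of portions of n randomly selected samples, both $\sigma_m$ and $\sigma_c$ may
be estimated. Let $x_i$ and $y_i$ be the first and second measurements on portions of the ith unit
(bottle, disc, sub-sample, etc.). The results may be tabulated as:

    Unit    First    Second    Difference            Average
    1       $x_1$    $y_1$     $d_1 = x_1 - y_1$     $z_1 = (x_1 + y_1)/2$
    2       $x_2$    $y_2$     $d_2 = x_2 - y_2$     $z_2 = (x_2 + y_2)/2$
    ...
    n       $x_n$    $y_n$     $d_n = x_n - y_n$     $z_n = (x_n + y_n)/2$

    $s_m^2 = \sum d_i^2 / 2n$  estimates  $\sigma_m^2$

    $s_z^2 = \sum (z_i - \bar{z})^2 / (n - 1)$  estimates  $\sigma_c^2 + \sigma_m^2 / 2$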
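
Under the duplicate-measurement scheme described above, the two standard deviations might be estimated as sketched below. The estimators used here are the usual ones for this design and are stated as assumptions: the sum of squared differences divided by 2n for the method variance, and the variance of the unit averages less half the method variance for the between-unit variance. The function name and the data are hypothetical.

```python
import math
import statistics


def estimate_sigmas(first, second):
    """Estimate method and between-unit standard deviations from duplicates.

    first[i] and second[i] are the two measurements (x_i, y_i) on unit i.
    Returns (s_m, s_c). s_c is reported as 0.0 when the variance difference
    is negative, i.e. when heterogeneity is not detectable above method noise.
    """
    n = len(first)
    if n < 2 or n != len(second):
        raise ValueError("need at least two units, each measured twice")
    diffs = [x - y for x, y in zip(first, second)]
    averages = [(x + y) / 2.0 for x, y in zip(first, second)]
    var_m = sum(d * d for d in diffs) / (2.0 * n)   # estimates sigma_m^2
    var_z = statistics.variance(averages)           # estimates sigma_c^2 + sigma_m^2 / 2
    var_c = max(var_z - var_m / 2.0, 0.0)
    return math.sqrt(var_m), math.sqrt(var_c)


# Hypothetical duplicate results on five bottles
x = [10.2, 10.6, 9.9, 10.4, 10.1]
y = [10.3, 10.5, 10.0, 10.3, 10.2]
s_m, s_c = estimate_sigmas(x, y)
print(f"s_m = {s_m:.3f}, s_c = {s_c:.3f}")
```

With only a few units, both estimates are themselves quite uncertain, which is one reason the number of units sampled is set in the statistical plan before measurement begins.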


Another way in which $\sigma_c$ may be estimated is by the use of an independent method of
measurement. Occasionally a highly precise method is available for intercomparison of samples
but  may  not  be  feasible  for  use  for  certification  measurement,  due  to  calibration  problems. 
Such  a  procedure  is  especially  useful  for  confirmation  of  the  homogeneity  status  of  a 
candidate  SRM. 

In  any  evaluation  of  homogeneity,  it  must  be  remembered  that  each  analyte  certified  must 
be  individually  examined  for  homogeneity  considerations.  It  is  not  justifiable  to  extend  con- 
clusions on  the  homogeneity  for  one  analyte  to  another  even  though  they  may  be  closely  related 
in other respects.

Homogeneity  statements  always  must  be  coupled  with  the  size  (mass)  of  sample  to  be  used. 
Basically,  heterogeneous  materials  such  as  bulk  solids  may  exhibit  gross  heterogeneity  as  the 
sample  size  is  diminished  (see  for  example  Appendix  D.2).  In  crushed  material,  for  example, 
individual  particles  could  have  widely  different  compositions.  Apparent  homogeneity  is 
improved  as  larger  sample  sizes  are  considered  since  individual  differences  will  be  averaged 
out.  Accordingly,  the  minimum  sample  size  necessary  to  realize  the  certified  values  often 
will  be  specified  in  the  certificate.  The  NBS  certified  values  cannot  be  extended  to  the 
composition   of   subsamples   smaller   than   the   recommended  size. 

Materials  of  group  B  which  are  found  to  be  essentially  homogeneous  are  certified  on  a  lot 
or  bulk  basis.  All  sub-samples  are  considered  to  have  the  same  composition  which  is  certified 
together  with  a  confidence  interval  based  on  estimates  of  uncertainties  for  both  random  and 
systematic   errors   of  measurement. 

Materials with significant but usable levels of inhomogeneity may be certified as a
batch,  in  which  case  the  statistical  tolerance  limits  are  given.  This  includes  the  average 
value  of  all  samples  in  the  batch  and  limits  within  which  individual  samples  are  expected  to 
lie,  with  a  stated  confidence.  Because  only  a  limited  number  of  samples  have  been  analyzed, 
one  cannot  say,  with  certainty,  the  limits  for  a  given  percentage  of  samples  but  only  a  pro- 
bability that  the  limits  are  valid.  Thus  one  could  say  that  there  is  a  95  percent  confidence 
that  95  percent  of  the  samples  in  the  batch  lie  within  some  specified  limits  (statistical 
tolerance  limits)    and   this   is   often  what   is   stated   on  a   certificate   for   this   kind   of  material. 
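
A statement of this kind can be illustrated with a sketch of two-sided tolerance limits. It assumes normally distributed unit values and uses Howe's approximation for the tolerance factor, with the chi-square quantile itself approximated by the Wilson-Hilferty formula so that only the standard library is needed. Both approximations, the function names, and the data are assumptions of this sketch; tabulated factors should be preferred for actual certification work.

```python
import math
from statistics import NormalDist, mean, stdev


def tolerance_factor(n, coverage=0.95, confidence=0.95):
    """Approximate two-sided normal tolerance factor k (Howe's method).

    The lower chi-square quantile is approximated (Wilson-Hilferty), so k is
    only good to about a percent or two, and only for moderate n.
    """
    if n < 5:
        raise ValueError("approximation is unreliable for fewer than 5 samples")
    nu = n - 1
    z_cov = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    z_low = NormalDist().inv_cdf(1.0 - confidence)   # lower-tail normal quantile
    c = 2.0 / (9.0 * nu)
    chi2_low = nu * (1.0 - c + z_low * math.sqrt(c)) ** 3
    if chi2_low <= 0.0:
        raise ValueError("chi-square approximation failed for these inputs")
    return math.sqrt(nu * (1.0 + 1.0 / n) * z_cov ** 2 / chi2_low)


def tolerance_limits(values, coverage=0.95, confidence=0.95):
    """Return (lower, upper) limits as mean -/+ k times the standard deviation."""
    k = tolerance_factor(len(values), coverage, confidence)
    centre, spread = mean(values), stdev(values)
    return centre - k * spread, centre + k * spread


# Hypothetical results on ten units of a batch
data = [5.02, 4.97, 5.05, 4.99, 5.01, 4.95, 5.03, 5.00, 4.98, 5.04]
print(tolerance_factor(len(data)))
print(tolerance_limits(data))
```

For ten samples the factor comes out near 3.4, well above the 1.96 that would apply if the mean and standard deviation were known exactly; the difference reflects the limited number of samples analyzed.
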

In the case of most granular SRM's, the within unit-of-issue heterogeneity is essentially
the same for all units-of-issue and no significant average difference is to be expected
between individual units. In such a case, the average composition of sub-samples within all
units would not be expected to differ significantly. In using such SRM's it should be
remembered that the composition of any sub-sample is expected to be within the tolerance
limits and the average composition of a number of sub-samples is expected to approach the
certified value.


6.      Use of SRM's

6.1     Kinds of SRM's


Standard  Reference  Materials  may  be  described  as  well-characterized  and  certified 
materials,  produced  in  quantity,  to  improve  measurement  science  and  technology.  They  fall 
into  three  general  categories  (see  Table  3):  (a)  certified  chemical  composition/purity  stan- 
dards; (b)  certified  physical  property  standards;  (c)  engineering  type  standards.  Categories 
(a)  and  (b)  may  be  further  subdivided  into  those  materials  related  to  basic  measurements  and 
those  to  applied  measurements. 

Table 3. Inventory of SRM's, 1969 - 1984

                                                         Number of SRM's
SRM Category                                          1969    1979    1984

Chemical Composition/Purity
  Steels and steel-making alloys                       141     161     159
  Cast, white, ductile, and blast furnace irons         16      23      25
  Nonferrous metals and alloys                          75     130     120
  Gases in metals                                        9      29      13
  High-purity metals                                     1       8       8
  Electron probe microanalytical                         0       6      12
  High purity chemical/microchemical                    16      26      18
  Clinical analysis                                      1      22      28
  Biological/botanical                                   0       9       9
  Environmental analysis (gases, liquids, solids)        3      72     109
  Forensic analysis                                      0       3       2
  Oil-soluble metallo-organic compounds                 24      24      22
  Fertilizers                                            0       3       3
  Ores                                                  10      22      21
  Cements                                                5       7       9
  Minerals, refractories, carbides, glass               17      22      29
  Glass trace elements                                   0      13       8
  Special nuclear materials                             18      33      30
  Isotopic reference standards                           9      15      14
  Subtotals                                            345     628     639

Physical Properties
  Ion activity                                           6      17      13
  Physical properties of glass                           7      13      15
  Elasticity                                             0       1       1
  Density                                                4       3       2
  Polymer molecular weight                               2       7      10
  Polymer rheology                                       0       1       1
  Temperature fixed points                               5      12      14
  Calorimetry                                            3      13       8
  Thermometers/thermocouples                             0       4       4
  Vapor pressure                                         0       4       3
  Thermal conductivity/expansion                         0      29      16
  Thermal resistance                                     0       1      10
  Magnetic properties                                    0      11      10
  Optical properties                                     4      20      24
  Radioactivity                                         40     156a    118a
  Permittivity                                           6       7       6
  Electrical resistivity                                 0      14      15
  Subtotals                                            177     333     270

Special Engineering Properties
  Standard rubbers                                      20      14      12
  Computer tapes                                         0       3       4
  Optical character recognition                          0       4       1
  Sieve sizing                                           3       5       4
  Cement turbidimetric and fineness                      1       1       1
  Color                                                 36       2       2
  Fading standards                                       4       4       1
  Flammability/smoke density                             1       3       3
  X-ray and photographic step tablets                    1       4       4
  Tear resistance-tape adhesion                          1       1       1
  Metal coating thickness                               35      32      10
  Octane rating                                          0       2       0
  Subtotals                                            102      76      43

Total                                                  544    1037     952

a Many of the radioactivity SRM's consist of short-lived isotopes and are available only on
  special order for limited time periods during the year.

Industrial materials that must be analyzed frequently for quality control of production
processes  constitute  a  major  fraction  of  all  SRM's.  Predominant  in  this  group  are  metals  in 
which  essentially  all  major  alloy  types  are  specifically  represented.  Ores,  minerals,  cement, 
glass,  and  ceramics  are  also  included.  Ordinarily,  these  SRM's  are  sampled  directly  from  the 
container  and  analyzed  by  the  user  in  the  same  manner  as  the  day-to-day  samples  of  the  labora- 
tory. The  results  of  such  SRM  analyses  are  frequently  control-charted  to  monitor  the 
measurement  process. 

High  purity  chemicals  for  use  as  primary  standards  in  a  wide  variety  of  chemical  analyses 
constitute  an  important  group  of  SRM's.  Typically,  the  user  will  prepare  solutions  from  these 
materials  that  may  be  used  directly  in  the  analytical  process  or  for  standardizing  other 
analytical  reagents. 

Clinical  laboratory  standards  compose  a  large  and  growing  group  of  SRM's.  Originally, 
such  SRM's  consisted  largely  of  pure  substances  from  which  spikes  or  calibration  solutions 
could  be  prepared.  They  are  now  being  augmented  by  natural  matrix  materials  containing 
analytes   of   interest   that   are   analyzed   directly  without   preliminary  preparation. 

The  environmental  group  satisfies  most  of  the  routine  monitoring  requirements  and  some 
special  situations  as  well.  SRM's  related  to  all  of  the  criteria  air  pollutant  analyses  were 
introduced  early  into  the  program.  These  were  followed  by  SRM's  related  to  the  measurement 
of  emissions  from  mobile  and  stationary  sources,  priority  pollutants,  and  hazardous  wastes. 
The SRM's for the gaseous pollutants of the atmosphere are directed particularly toward the
calibration of analyzers and toward providing traceability for industrially produced working
standards. Because of the wide variety of sample types and the number of constituents of
interest, it is virtually impossible to provide matrix matches for most of the samples
encountered by organic analysts. Accordingly, the SRM's in this area represent either
high-priority sample types or generic materials that should be widely applicable. Some SRM's
useful for spiking or other types of standards preparation are also available.

A number of natural matrix SRM's certified for most of the inorganic constituents of
environmental interest and some organic substances have been produced. These include several
biological matrix samples (orchard leaves, now replaced by citrus leaves, was the first SRM in
this group) and also urban particulate matter and river and marine sediments. Industrial
hygiene analysis materials are a small but important sub-group in the environmental category.
This list will be augmented as possible, and as demand is shown for additional items.

The physical property standards reflect many of the kinds of measurements made in testing
laboratories. The gamut runs from those useful for the conventional physical measurements of
temperature, melting point, vapor pressure, calorimetry, conductivity, and thermal expansion
to color, thickness of electrodeposits, and fineness of powders. Radioactivity
standards are also classified in this category.

Engineering  standards  are  a  small  but  growing  group  which  is  rather  diverse.  Standard 
rubbers,  SRM's  for  evaluating  the  performance  of  magnetic  tapes,  and  flammability  standards 
are  examples. 

NBS  Special  Publication  260,  Catalogue  of  Standard  Reference  Materials,  lists  all  SRM's, 
research  materials,  and  special  reference  materials  that  are  available  and  those  that  are  in 
progress  at  the  time  it  is  issued.  The  catalogue  is  updated  periodically,  and  supplements  are 
issued   in   the   interim.     See   page   3   for   how   to  receive   a  copy. 

The  SRM  program  tries  to  keep  abreast  of  and  even  anticipate  changes  in  technology,  since 
it  may  take  a  period  of  several  years  to  develop  and  certify  a  new  SRM.  Input  from  the  user 
community,  from  contacts  with  professional  colleagues,  and  NBS's  own  measurement  experience 
are  influential  in  guiding  and  establishing  priorities  for  SRM  development.  SRM  sales  provide 
guidance  on  inventory  maintenance  and  priorities  when  questions  of  renewal  must  be  decided. 
While  present  contacts  are  extensive,  additional  input  to  the  decision  process  is  sought.  It 
is  believed  that  there  are  areas  of  technology,  and  especially  new  technologies,  where  SRM's 
are  not  significantly  utilized  and  which  would  benefit  from  their  use.  Information  about 
these  is  especially  solicited. 

Table  4  lists  the  major  areas  into  which  SRM  development  is  expected  to  expand  in  the 
near  future.  The  list  may  be  revised  as  technology  advances  and  as  new  technologies  appear. 
User  input  is  actively  sought  to  confirm  the  wisdom  of  the  proposed  plans,  or  to  provide 
information  on  which  revisions  might   be  considered. 

Table 4. Some Major New SRM Needs Forecast for the Period 1984-1989

Trace Organic Analysis
    Organic pollutants in natural matrices
    Nutrients and toxic substances in food and agricultural products
    Enzymes, individual proteins, and therapeutic drugs in human serum
    Additives in plastics

Trace Elemental Analysis
    High-performance alloys
    Plastics
    Glass and ceramic materials
    Semiconductor materials
    Fibers

Bulk Compositional Analysis
    Recycled materials
    Uranium and plutonium nuclear fuels (for nuclear safeguards)
    Fibers

Quantification of inorganic chemical species

Density Standards

Dimensional Standards
    Electron microscope magnification
    Particle sizing
    Photomask linewidth

Voltage Standards (Josephson Junction)

Optical Properties
    Fluorescence
    Reflectance
    Wavelength

Electronic Properties
    Spreading and sheet resistance
    Glass dielectric

Nondestructive Evaluation Standards
    Radiographic sharpness
    Visual acuity
    Dye penetrant crack plate

Nuclear Waste Disposal
    Leach rate testing
    Thermal expansion and conductivity of waste forms

Polymer Molecular Weight
    Star-branched polystyrenes
    Ethylene-propylene copolymers


While SRM's constitute the largest category, two other types of reference materials,
described below, are produced or distributed by NBS.

Research Materials (RM's) are in addition to and distinct from the SRM's issued by NBS.
The distinctions between Research Materials and Standard Reference Materials are in the
information supplied with them and the purpose for which they should be used. Unlike SRM's,
the RM's are not issued with Certificates of Analysis; rather, they are accompanied by a
"Report of Investigation," the sole authority of which is the author of the report. A Research
Material is intended primarily to further scientific or technical research on that particular
material. One of the principal considerations in issuing an RM is to provide homogeneous
material so that an investigator in one laboratory can be assured that the material he has is
the same as that being investigated in a different laboratory. There are presently several
materials in this category.

Special Reference Materials, called GM's, are distributed by NBS to meet industry needs.
These materials have been standardized either by some Government agency other than NBS, or by
some standards-making body such as the American Society for Testing and Materials (ASTM), the
American National Standards Institute (ANSI), or the Organization for International
Standardization (ISO). For this class of materials, NBS acts only as a distribution point and
does not participate in their standardization.


6.2     Choosing  an  SRM 


The large number of SRM's available may cause some confusion as to which to choose for a
given purpose. Obviously, matrix match is a major consideration, since there is little or no
difficulty in interpreting test results of such materials. A close match is possible only when
recurring analysis of well-defined materials is of concern, for example, large-volume
industrial products. However, all of the SRM's have been developed as the result of widespread
needs, and much consideration has been given to providing matrices that are either typical or
that can satisfy generic purposes.

When a matrix match is possible, analysts are advised to use such SRM's. In consideration
of this, many users stock a relatively large number of them, encompassing the variety of
materials that they expect to analyze. For many users, a perfect matrix match will not be
possible; hence professional judgment will be required to select the ones most useful for
each situation. The Office of Standard Reference Materials (OSRM), and especially the NBS
scientists who certify the SRM's, have special experience in most of the measurement areas
represented by the SRM inventory. Inquiry to OSRM (see p. 31*) will provide access to the
scientists who may be able to advise in the selection and use of appropriate SRM's for a given
purpose.

6.3     Use  of  SRM's 

Because of the high reliability of the certified values, Standard Reference Materials
find a wide variety of uses, ranging from special occasions when a material of known properties
is needed to test some aspect of measurement to the continual quality assurance of
measurement systems. Table 5 is a summary of the most common kinds of applications.


Table   5.     Uses   of  SRM's   in  Measurement  Systems 


Method Development and Evaluation
    Verification and evaluation of precision and accuracy of test methods
    Development of reference test methods
    Evaluation of field methods
    Validation of methods for a specific use

Establishment of Measurement Traceability
    Development of secondary reference materials
    Development of traceability protocols
    Direct field use

Assurance of Measurement Compatibility
    Direct calibration of methods and instrumentation
    Internal (intralaboratory) quality assurance
    External (interlaboratory) quality assurance


Any  use  of  a  reference  material  depends  on  the  ability  to  make  valid  inferences  from  the 
measurement  results.  This  involves  the  tacit  assumption  or  demonstrated  evidence  that  the 
material  is  reliable  and  capable  of  challenging  the  measurement  process.  Furthermore,  the 
measurement  process  must  be  known  to  be  in  a  state  of  statistical  control,  since  limited 
measurements   of   the  reference  material   will   be   used   for   predictive   or   evaluative  purposes. 

Examples of one-time uses of SRM's as known test materials are numerous. Whenever an
analytical method is developed or modified, a well-characterized test material is needed to
evaluate its performance characteristics. SRM's, as appropriate, are obvious choices in such
cases, and numerous examples are cited in the literature. A survey during an 18-month period
(10) identified 40 research articles citing the use of SRM's in the development or evaluation
of a wide variety of methods for chemical analysis, and in particular for trace analysis.
Likewise, performance checks of instrumentation, such as linearity, stability, and
sensitivity, are highly dependent on reliable test materials. Here again, an SRM is a logical
choice for such a purpose.

When  new  methodology  is  adopted  by  a  laboratory,  familiarization  measurements  must  be 
made  in  order  to  gain  proficiency  in  its  use.  The  use  of  SRM's  will  eliminate  questions  of 
stability  and  homogeneity  that  might  complicate  the  test  results  when  using  other  materials. 
Some  laboratories  use  SRM's  to  confirm  a  new  analyst's  capability  to  perform  tests  before 
undertaking  measurements   in  their   test  programs. 

When  a  contract  laboratory  is  needed  to  provide  analytical  services  to  an  individual  or 
in  a  monitoring  program,  evidence  of  the  capability  to  do  so  is  often  a  requirement.  The 
analysis  of  test  samples  provided  by  the  client  is  one  approach  to  evaluate  the  competence  of 
candidates.  SRM's  are  virtually  unexcelled  for  this  purpose.  The  only  questions  in  such 
usage  are  the  selection  of  an  appropriate  SRM  and  the  possibility  of  falsification  of  data  due 
to  recognition  of   the   test   sample   as   an  SRM   with  a   known  composition. 

The  use  of  SRM's  for  educational  purposes  should  not  be  overlooked.  Understanding  of 
analytical  chemistry  is  best  demonstrated  by  practical  laboratory  work  in  which  students 
analyze  real  samples.  No  better  ones  are  available  than  SRM's  which  provide  the  opportunity 
to  test   both   the   precision   and   the   accuracy   of   the   analytical  results. 

SRM's find use as calibrants in certain cases. For example, the practical pH scale is
defined by NBS SRM's, and SRM's for fixed temperature points are available. The use of SRM's
as primary chemical standards has already been discussed. Some industrial matrix SRM's, and
particularly metals, are used for calibrating chemical analyzers. Many of the standard
methods developed by ASTM Committee E-2 on Emission Spectroscopy can be calibrated directly
with appropriate SRM's. As an example, ASTM Standard Method E 322 is used when analyzing
low-alloy steels by X-ray fluorescence, and the NBS D-800 and 1200 series of low-alloy steel
SRM's are recommended as calibrants.

One  of  the  major  uses,  and  the  original  driving  force  behind  SRM  development,  is  for  the 
quality  assurance  of  measurement  processes.  When  various  analysts  use  different  methodologies 
(and  even  the  same  methods),  unacceptable  discrepancies  can  arise,  usually  attributable  to 
calibration  or  procedural  differences.  The  analysis  of  commonly  available  reference  materials 
can   identify   such   problems   and   lead   to   their  solution. 


Figure 7. Systems Approach to Measurement Accuracy

This figure depicts a hierarchical system of measurement methods and reference materials.
The function of each component (1 to 6) is to transfer accuracy to the level immediately below
it and help provide traceability to the level immediately above it, thus helping to assure
overall measurement compatibility. Proceeding from the bottom to the top of the measurement
hierarchy, accuracy requirements increase at the expense of decreased measurement efficiency.
At the top are the so-called definitive methods of analysis or test, which give the most
accurate values obtainable. Unfortunately, definitive methods (for example, gravimetric
techniques for preparing analyzed gas SRM's) are usually time consuming and sophisticated.
Thus, they are not economically acceptable for widespread and routine use. Definitive
methods, however, are used whenever possible to certify NBS SRM's. With these materials,
accuracy can be transferred throughout a measurement system.

SRM's are commonly used in developing reference methods and assuring their accuracy.
Such methods may be suitable for direct use. Alternatively, they may serve as a basis for
developing or evaluating other methods. Reference methods are also commonly used for
producing secondary reference materials, which, in turn, are directly used in routine field
measurement applications.

In  principle,  the  accuracy  of  numerous  field  methods  can  be  traced  to  a  definitive  method 
in  a  hierarchical  accuracy-based  measurement  system.  SRM's  and  other  reference  materials  are 
essential  in  the  transfer  of  accuracy.  Also  essential  are  good  methods,  good  laboratory 
practices,    well   qualified   personnel,    and   adequate   quality   assurance  procedures. 

The  complexity  of  modern  chemical  analysis  provides  many  sources  of  error  and 
opportunities  for  introduction  of  bias  and  imprecision.  Accordingly,  such  systems  must  be 
operated  under  a  rigid  quality  assurance  system  if  results  are  to  be  meaningful.  It  is  not 
sufficient  to  check  the  calibration  of  instruments  although  this  is  always  necessary.  Rather, 
the  performance  of  the  entire  system  needs  to  be  monitored  on  a  regular  basis.  SRM's  are 
finding   increasing   use   as   test  materials   to  monitor   system  performance. 

SRM's are best used on a regular basis. The sporadic use of reference materials when
trouble is suspected is legitimate, but systematic measurement in a control-chart mode of
operation will generally be more informative and is highly recommended. SRM's may be used as
the sole reference material, or they may be used with internal reference materials in a
systematic manner, thus conserving the former and adding credence to the latter.
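
The control-chart mode of operation recommended above can be sketched in a few lines. This is a minimal illustration, not a prescribed NBS procedure: the function names, the baseline values, and the choice of 2s warning limits and 3s action limits (a common control-chart convention) are all assumptions made for the example.

```python
from statistics import mean, stdev

def control_limits(baseline):
    """Center line and limits from a baseline series of SRM results.

    Assumption: warning limits at +/- 2s and action limits at +/- 3s.
    """
    center = mean(baseline)
    s = stdev(baseline)  # sample standard deviation (n - 1 denominator)
    return {
        "center": center,
        "warning": (center - 2 * s, center + 2 * s),
        "action": (center - 3 * s, center + 3 * s),
    }

def classify(value, limits):
    """Label a new SRM result relative to the control limits."""
    low_action, high_action = limits["action"]
    low_warning, high_warning = limits["warning"]
    if not low_action <= value <= high_action:
        return "out of control"
    if not low_warning <= value <= high_warning:
        return "warning"
    return "in control"

# Hypothetical baseline: repeated analyses of one SRM over several days.
baseline = [10.02, 9.98, 10.05, 9.95, 10.00, 10.03, 9.97]
limits = control_limits(baseline)

# Hypothetical later results for the same SRM.
statuses = [classify(v, limits) for v in (10.01, 10.09, 10.20)]
print(statuses)
```

Plotting each new result against these limits in time order gives the chart itself; the classification step only flags which points would call for attention.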

The  use  of  SRM's  as  quality  assurance  materials  is  discussed  in  more  detail  in  the 
article  contained  in  Appendix  (D.3). 

6.4     Interpretation  of  Reference  Material  Analyses 

Some SRM's have a matrix identity with test samples and can be used directly to establish
the response function of chemical analyzers. Others may be used by a laboratory as their
primary standards. However, the majority of the SRM's are quality assurance materials and
should be analyzed regularly or on occasion to monitor the performance of a measurement system.

The four general cases for use of SRM's as quality assurance materials are illustrated in
Figure 8 (a-d). When a matrix match is possible (8a), the uncertainty in the sample
measurements can be equated to that observed in measurement of the SRM. When such a match is
not possible but an SRM with a related matrix is available (8b), the test sample uncertainty
may be related to that observed when measuring the SRM. Even when the above situations do
not apply, the measurement of an appropriate SRM (8c) can monitor the measurement system, and
its performance when measuring test samples can be inferred in many cases. When an SRM is
unavailable or not used (8d), measurement uncertainty must be inferred from other evidence,
such as physical calibrations and the experience of others. Obviously it is to the
advantage of a laboratory to evaluate its own performance, using SRM's or other reliable
reference materials whenever possible.

The  results  obtained  when  analyzing  reference  materials  should  be  interpreted  after  due 
consideration.  When  measured  consistently  and  utilizing  control  charts,  they  can  effectively 
monitor  a  measurement  process.  When  measured  in  isolation  the  results  could  be  inconclusive 
or  even  misleading. 

The inability to correctly analyze a reference material may cast serious doubts on the
reliability of a measurement process but provides no diagnostic information. Also, the correct
analysis of an SRM may not necessarily indicate the converse. Referral to Figure 9 will
clarify this point. In this figure, measured values are plotted with respect to the expected
values (certified values, for example). For an unbiased system the data would be represented
by line A.

Various kinds of linear measurement bias are illustrated in Figure 9. Line B corresponds
to a constant bias (negative in this case, but it could be positive as well), while line C
results from bias which is proportional to the concentration level of the sample. The
proportionality factor could be less than or greater than unity (shown). Line D results from
a combination of constant and level-proportional bias. It is obvious that the measurement of
one reference sample will not evaluate the performance of a measurement system throughout a
concentration range unless supplemented by other information. One could even obtain a result
such as point (2) and conclude that a system was unbiased when the analysis of additional
reference materials might indicate performance representable by line D, for example.
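
The distinction among lines A through D can be illustrated with a straight-line fit of measured values against certified values for several reference materials. The sketch below is only an illustration under stated assumptions: the function names, the tolerances, and the data are hypothetical, and ordinary least squares is used for simplicity. In practice, whether an intercept or slope differs from the ideal would be judged against the known precision of the measurement.

```python
def fit_bias(certified, measured):
    """Least-squares fit of measured = intercept + slope * certified."""
    n = len(certified)
    if n < 2 or n != len(measured):
        raise ValueError("need at least two paired values")
    mean_x = sum(certified) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in certified)
    if sxx == 0:
        raise ValueError("certified values must span a range")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(certified, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def line_type(intercept, slope, tol_intercept, tol_slope):
    """Map a fitted line onto the cases A to D of Figure 9.

    The tolerances are hypothetical stand-ins for a proper statistical test.
    """
    constant = abs(intercept) > tol_intercept
    proportional = abs(slope - 1.0) > tol_slope
    if constant and proportional:
        return "D"  # combined constant and level-proportional bias
    if constant:
        return "B"  # constant bias only
    if proportional:
        return "C"  # level-proportional bias only
    return "A"      # no detectable linear bias

# Hypothetical data following measured = -0.5 + 1.1 * certified (a line D case).
certified = [1.0, 5.0, 10.0]
measured = [0.6, 5.0, 10.5]

intercept, slope = fit_bias(certified, measured)
print(intercept, slope, line_type(intercept, slope, 0.05, 0.01))
```

Note that the middle material in this made-up data set is recovered exactly (5.0 measured for 5.0 certified). Analyzed alone, it would suggest an unbiased system, which is the trap described for point (2).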

Occasionally,  situations  may  occur  where  linear  treatment  of  biases  does  not  apply,  but 
these  are  not   described   in  the   present  paper. 

When possible, the analysis of several reference samples, spanning the concentration
range of interest, is the most useful way to investigate measurement bias. The three-sample
approach (analysis of a low, middle, and upper range sample) is practical in most cases,
provided that the reference samples are sufficiently homogeneous and that the range of
analytical interest is covered. Bias is even identifiable using relatively non-homogeneous
samples, provided that a sufficient number are analyzed. It is highly unlikely that n randomly
selected samples from a lot would all deviate in a systematic manner from a population mean
value, provided n is 5 or more. Thus the measurement of such samples should indicate the kind
of bias that may be present. Statistical advice may be necessary in such cases to plan the
number of samples and the replicate measurements needed and to evaluate the test results.
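
One way to see why n of 5 or more is a reasonable threshold is a simple sign argument. This is offered as a plausible reading, not as the reasoning stated in the text: assume each randomly selected sample is equally likely to lie above or below the population mean, independently of the others. The function name is hypothetical.

```python
def prob_all_same_side(n):
    """Chance that n independent samples all fall on one side of the mean.

    Assumes each sample is above or below the mean with probability 1/2.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return 2 * 0.5 ** n  # all above, or all below

for n in (3, 5, 8):
    print(n, prob_all_same_side(n))
```

Under this assumption the chance is 0.25 for n = 3 and about 0.06 for n = 5, so a consistent one-sided deviation across five or more samples is more plausibly a measurement bias than an accident of sampling.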

When  supported  by  other  data,  the  measurement  of  even  a  single  reference  sample  can  be 
meaningful.  Thus  a  knowledge  of  the  standard  deviation  of  measurement,  obtained  from  other 
data,  would  answer  whether  point  (1)  or  point  (2)  could  be  considered  as  represented  by  line 
A.  Measurement  of  a  series  of  non-reference-material  samples  might  provide  some  knowledge 
about  the  slope  and  hence  assist  in  the  interpretation  of  the  SRM  measurement  data.  The  best 
diagnostic  information  would  be  obtained  from  the  measurement  of  a  series  of  SRM's  containing 
graduated  levels  of  analyte.  When  such  are  available,  the  use  of  all  of  them  will  maximize 
the   information  on   the   performance   characteristics   of   an   analytical  system. 

Figure 9. Identifying measurement bias (measured value plotted against expected value)


An SRM may have a reasonable matrix match with test samples but differ from them in level
of concentration. If the level of analyte in the SRM is higher than that of the test sample,
it may be possible to quantitatively dilute the SRM. The best diluent is the matrix of the
SRM, but a neutral matrix may be used in some cases. Two SRM's containing different levels of
an analyte may be proportionally mixed to obtain a series of materials, ranging from the
concentration level of the lower to that of the higher. These techniques are described in ASTM
D-3975, Preparation of Samples for Collaborative Testing of Methods for Analysis of
Sediments. The expression used to calculate the composition of a blend of two samples, A and
B, is as follows:


$$C_M = \frac{C_A W_A + C_B W_B}{W_A + W_B}$$

where

    $C_A$ = weight percent (or ppm) of constituent a in sample A
    $C_B$ = weight percent (or ppm) of constituent a in sample B
    $C_M$ = weight percent (or ppm) of constituent a in mixture
    $W_A$ = weight of sample A in mixture
    $W_B$ = weight of sample B in mixture

When a material is diluted with a second material containing an insignificant amount of
the analyte of interest, the expression to be used is:

$$C_M = \frac{C_A W_A}{W_A + W_D}$$

where

    $W_D$ = weight of diluent sample mixed with $W_A$

All dilutions must be made with care. Because uniform mixing may be difficult to
achieve, the entire mixture that has been prepared may need to be used, rather than
subsamples of it. Despite such problems, the technique is attractive since it can provide
reference materials that simulate the test samples more closely and can be used to evaluate a
measurement process over a range of concentration levels, as discussed in Section 6.4.
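
The blending and dilution calculations amount to a mass balance on the constituent of interest, and can be sketched as follows. The function and argument names are hypothetical, chosen only for this illustration: `c_a` and `c_b` are the concentrations (weight percent or ppm) in samples A and B, and `w_a`, `w_b`, and `w_d` are the weights of sample A, sample B, and diluent. The third function, which solves the same balance for the weight of B needed to reach a target level, is an added convenience and not part of the original text.

```python
def blend_concentration(c_a, w_a, c_b, w_b):
    """Concentration of a constituent in a mixture of samples A and B."""
    total = w_a + w_b
    if w_a < 0 or w_b < 0 or total <= 0:
        raise ValueError("weights must be non-negative with a positive total")
    return (c_a * w_a + c_b * w_b) / total

def diluted_concentration(c_a, w_a, w_d):
    """Concentration after diluting sample A with an analyte-free diluent."""
    return blend_concentration(c_a, w_a, 0.0, w_d)

def weight_of_b_for_target(c_a, w_a, c_b, c_target):
    """Weight of sample B to mix with w_a of sample A to reach c_target.

    The target must lie between the two concentrations and differ from c_b.
    """
    low, high = min(c_a, c_b), max(c_a, c_b)
    if not low <= c_target <= high or c_target == c_b:
        raise ValueError("target must lie between c_a and c_b, excluding c_b")
    return w_a * (c_a - c_target) / (c_target - c_b)

# Hypothetical example: 2 g of a 100 ppm material blended with 6 g at 20 ppm.
print(blend_concentration(100.0, 2.0, 20.0, 6.0))
# Hypothetical example: 1 g of a 100 ppm material diluted with 3 g of blank matrix.
print(diluted_concentration(100.0, 1.0, 3.0))
```

These calculations assume that the weights are measured accurately and that the mixture is homogeneous, which is the practical difficulty noted above.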


6.5     Evaluation  of  Measurement  Error 

The values measured by a user for an analyte or a parameter will rarely agree fully with
the certified value, owing to uncertainties in each. The question naturally arises as to how
large a difference is significant. This will depend on the uncertainty of the measurement by
the user (see Figure 10) and the certification limits for the SRM. The former can be
calculated using the expression (see also Section C.5):

$$\bar{X} \pm \left( \frac{t s}{\sqrt{n}} + B \right)$$

where $\bar{X}$ is the mean of n measurements by the user, whose estimated standard deviation
of measurement is s. Student's t value depends on the number of degrees of freedom in the
estimation of s (n-1 if s is based on the measurements of the moment) and is taken for a 95%
confidence level, which is the usual level for certified values of an SRM. The value B is the
user's estimate of the magnitude of any uncorrected biases inherent in his measurement and is
based on experience and professional judgment.


Figure 10. Uncertainty of measured value

Cm = Uncertainty of Measured Value
B = Biases, Errors Inherent in Measurement
s = Precision of Measurement
$s/\sqrt{n}$ = Precision of Mean of n Measurements (really $s_{\bar{x}}$)

If the confidence interval intersects the confidence or tolerance interval of the SRM,
there is agreement. If not, then a discrepancy exists which should be investigated. In the
case of heterogeneous SRM's, several sub-samples may need to be measured to evaluate
measurement bias.
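
The comparison described above can be sketched as a short calculation: form the user's interval from the mean, Student's t, the standard deviation, and the bias estimate B, and check whether it intersects the certified interval. Everything named here is an assumption for illustration. The measurement values, the bias estimate, and the certified value with its limits are made up, and Student's t is passed in from a table (2.776 is the two-sided 95% value for 4 degrees of freedom) rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def user_interval(measurements, t_value, bias_estimate=0.0):
    """Interval mean +/- (t * s / sqrt(n) + B) for the user's measurements."""
    n = len(measurements)
    if n < 2:
        raise ValueError("at least two measurements are needed to estimate s")
    x_bar = mean(measurements)
    s = stdev(measurements)  # estimated from these n measurements (n - 1 df)
    half_width = t_value * s / sqrt(n) + bias_estimate
    return x_bar - half_width, x_bar + half_width

def intervals_overlap(a, b):
    """True when the closed intervals a and b intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical user results for one SRM (five replicate measurements).
results = [4.95, 5.05, 5.00, 5.10, 4.90]
T_95_4DF = 2.776  # Student's t, 95% two-sided, 4 degrees of freedom

user = user_interval(results, T_95_4DF, bias_estimate=0.02)

# Hypothetical certified value 5.08 with limits of +/- 0.04.
srm = (5.08 - 0.04, 5.08 + 0.04)
agrees = intervals_overlap(user, srm)
print(user, srm, agrees)
```

If s is taken from a longer history of measurements instead of the current set, the t value should correspond to the degrees of freedom of that estimate.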

If an apparent discrepancy is found, it is advisable to look closer at the estimates of
uncertainty. Rarely will a user's uncertainty (Cm) be less than that of the NBS measurements,
which are done with state-of-the-art techniques. Perhaps there are unsuspected biases in the
user's laboratory, which the SRM has uncovered. If an explanation cannot be found, the user
should communicate with NBS OSRM, which will look into the matter and advise as possible. The
sample may have deteriorated or become contaminated, so the possibility of such may need to
be considered.

It cannot be too strongly emphasized that statistical control must be attained before any
data can be believed and any errors identified or corrected. There is no easy way to identify
assignable causes for either unacceptable bias or precision, which is the first step for
corrective actions. Consultation with experienced users of the methodology employed may be
helpful to suggest approaches to follow, if not the specific solutions. The magnitude of the
errors encountered may rule out certain sources and indicate likely ones. However, the
simultaneous existence of several sources of unacceptable error cannot be discounted.

In  diagnosing  error,  it  should  be  remembered  that  random  errors  add  up  in  quadrature, 
which  is  to  say  the  variance  of  random  errors  is  additive,  as  discussed  in  2.1.1.  When  the 
measurement  system  is  well-understood,  it  may  be  possible  to  estimate  the  variance  of  the 
individual  steps  or  operations  of  which  it  is  composed  and  to  compare  such  with  the  magnitude 
of  the  errors  of  concern.  Obviously,  the  step  or  operation  with  largest  variance  is  the  first 
one   to  be   considered  when   other    information   is   not  available. 

On the other hand, there is no reason to believe that biases are randomly distributed;
rather, they add up algebraically. Accordingly, small systematic errors contribute differently
than small random errors.
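
The difference between the two ways of combining errors can be shown with a small calculation. This is a sketch with hypothetical step names and values, assuming the steps are independent: random errors are combined by adding variances, biases by adding signed values, and the step with the largest variance is singled out as the first candidate for attention.

```python
from math import sqrt

def combined_random_sd(step_sds):
    """Standard deviations of independent steps combine in quadrature."""
    return sqrt(sum(sd ** 2 for sd in step_sds))

def combined_bias(step_biases):
    """Biases combine algebraically, so signs matter."""
    return sum(step_biases)

def dominant_step(sd_by_step):
    """Name of the step contributing the largest variance."""
    return max(sd_by_step, key=lambda step: sd_by_step[step] ** 2)

# Hypothetical standard deviations for two steps of a measurement process.
sd_by_step = {"extraction": 0.4, "instrument reading": 0.3}
total_sd = combined_random_sd(sd_by_step.values())

# Hypothetical biases of the same magnitudes, for comparison.
total_bias = combined_bias([0.4, 0.3])
cancelling_bias = combined_bias([0.4, -0.4])

print(total_sd, total_bias, cancelling_bias, dominant_step(sd_by_step))
```

With these made-up values the random errors combine to 0.5 while biases of the same size add to 0.7, and biases of opposite sign can cancel entirely, which is why a small total bias does not prove that the individual steps are unbiased.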

Whenever excessive bias or imprecision is found to be present, corrective action needs to
be taken; otherwise the measurement process will have limited usefulness. The first question
that needs to be answered, in this regard, is whether the unsatisfactory situation is inherent
to the methodology or is due to its application in a given laboratory or even by specific
persons. Collaborative test data and/or the research findings of others may indicate the
magnitude of the former. If the experience of a laboratory is not consistent with this, excess
application imprecision or bias would be suspected.

Factors   to   be   considered   in  reducing  operational    (non-state-of-the-art)   bias  include: 


better   quality   of  calibrants 
improvement   in  calibration 
reducing  contamination 
reducing  mechanical  losses 

reducing solution/extraction inefficiencies
removal   of  interferences. 


Factors   to  be   considered   in   reducing  operational   random   error  include: 

improvement   of   technical  skills 
improvement   of  manipulative  skills 
improvement   of   environmental  control 
closer   tolerances   in   operational  parameters 
improved  instrumentation 
reducing  variability  of  blanks. 
7.      Reporting Analytical Data


7.1     Limits  of  Uncertainty  for  Data 

Data are of limited use, and can even be useless, unless limits of uncertainty can be
estimated and assigned to them [1]. The limits should include the limits of uncertainty of
any systematic error (bias) and the random errors of measurement. Estimates of the limits of
systematic error are based on judgment and require a full understanding of the measurement
system used. The experience of the analytical community is helpful in this respect. The
random error component is based on the skill and experience of the laboratory making the
measurements and is evaluated from the standard deviation of the measurement process.

When the measurement process is demonstrated to be in a state of statistical control, the
process standard deviation may be used to evaluate the confidence interval for the mean of n
measurements (see C.5). The use of control charts is the best way to demonstrate that a
process is in a state of statistical control at the time the measurements are made, and this
can minimize the amount of work needed to establish confidence limits for the data.

In the absence of a control chart, a sufficient number of replicate measurements must be
made to demonstrate statistical control and to estimate the standard deviation of measurement
with a measurable degree of confidence. The minimum number of replicate measurements required
is somewhat arbitrary and depends on the risk associated with exceeding the limits of
confidence that are stated. Metrologists often recommend 7 to 30 determinations as a
reasonable number of replicates. The uncertainty of the standard deviation increases rapidly
below 7, and little is to be gained by increasing the number above 30.
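
The 7-to-30 recommendation can be made plausible with a rough calculation. The sketch below is an illustration and is not taken from the handbook. It assumes normally distributed measurements and uses the common approximation that the relative standard error of a standard deviation estimated from n replicates is about 1/sqrt(2(n-1)); exact limits would come from the chi-square distribution.

```python
import math


def rel_uncertainty_of_s(n):
    """Approximate relative standard error of a standard deviation
    estimated from n normally distributed replicates (n >= 2)."""
    if n < 2:
        raise ValueError("at least two replicates are needed")
    return 1.0 / math.sqrt(2.0 * (n - 1))


for n in (2, 3, 5, 7, 10, 20, 30, 60, 100):
    print(f"n = {n:3d}   relative uncertainty of s ~ {rel_uncertainty_of_s(n):.0%}")
```

Under this approximation the uncertainty of s falls steeply over the first few replicates and only slowly thereafter, which is consistent with the guidance given above.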

It is not considered good practice to correct for biases without understanding their
origin. Biases such as analytical blanks should be measured as accurately as necessary and
possible, and the results corrected directly for them [18]. This is proper because
analytical chemists believe that blanks are additive errors. Biases such as those found when
measuring SRM's are investigated to identify their source so that they can be eliminated,
minimized, or corrected for in a proper manner. As already pointed out (see 6.4), the method
of correction will be dictated by the nature of the bias. The most reliable approach is to
remove the bias rather than to correct for it.

The treatment of bias related to the question of recovery often troubles the trace
analyst. Usually corrections are not made, but recoveries are reported as one of the
qualifications for the data. The recovery, no matter what its value, should be shown to be in a
state of statistical control, and this should be a requirement for reporting data of this
kind. When a recovery determination is made and the value obtained is variable or does not
fall within control limits, corrective action is indicated and control should be
re-established before data may be reported.
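
A recovery check of this kind might be sketched as follows. This is an illustration only: the recovery values are hypothetical, and the control limits are placed at the central line plus or minus three standard deviations, a common control-chart convention that is assumed here rather than taken from this section.

```python
import statistics

# Hypothetical historical recoveries (percent) obtained while the process
# was judged to be in statistical control.
history = [92.1, 90.8, 93.0, 91.5, 92.4, 91.0, 92.8, 91.9, 92.2, 91.3]

center = statistics.mean(history)
s = statistics.stdev(history)

# Assumed convention: control limits at the central line +/- 3 s.
lower, upper = center - 3 * s, center + 3 * s


def in_control(recovery):
    """Return True when a new recovery falls within the control limits."""
    return lower <= recovery <= upper


for value in (92.0, 88.0):
    status = "in control" if in_control(value) else "out of control: take corrective action"
    print(f"recovery {value:.1f}%: {status}")
```

Note that the check says nothing about whether the recovery is high or low in absolute terms; it only tests whether the recovery is behaving consistently, which is the requirement stated above.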

7.2     Significant  Figures 

Numerical  data  are  often  obtained  (or  at  least  calculations  can  be  made)  with  more  digits 
than  are  justified  by  its  accuracy  or  precision.  So  that  it  is  not  misleading,  such  data  when 
reported  should  be  rounded  to  the  number  of  figures  consistent  with  the  confidence  that  can  be 
placed  on  it  (see  guidelines  for  Reporting  Results  7.3).  Accordingly,  metrologists  have 
adopted  the  terminology  of  significant  figures  in  describing  the  resulting  data.  The  number 
of  significant  figures  is  said  to  be  the  number  of  digits  remaining  when  the  data  is  so 
rounded.  The  last  digit,  or  at  most  the  last  two  digits  are  expected  to  be  the  only  ones  that 
would  be  subject  to  change  on  further  experimentation,  for  example.  Thus,  for  a  measured 
value  of  20.5,  only  the  5  and,  at  most  the  0.5  would  be  expected  to  be  subject  to  change.  Such 
data  would   be   described  as   having   three   significant  figures. 

In  counting  significant  figures,  any  zeros  used  to  locate  a  decimal  point  are  not 
considered  as  significant.  Thus  0.0025  contains  only  two  significant  figures.  Any  zeros  to 
the  right  of  the  digits  are  considered  as  significant,  thus  only  those  that  have  significance 
should  be  retained.  Thus  2500  and  2501  each  have  four  significant  figures.  Zeros  should  not 
be added to the right of significant digits to define the magnitude of a value, unless they are
significant, since this would confuse the significance of the value. For example, it is not
good practice to report a value as 2500 ng but rather as 2.5 µg if the data are reliable to two
significant figures. The use of exponential notation, e.g., 3.5 x 10^3, is an acceptable way to
report data with two significant figures which would otherwise have to be reported as 3500,
suggesting 4 significant figures.

In multiplication and division, the operand with the least number of significant
figures determines the number of figures to be reported in the result. For example, the
product 1256 x 12.2 = 15323.2 is reported as 1.53 x 10^4. In addition and subtraction, the
least number of figures to either the right or the left of the decimal point determines the
number of significant figures to be reported. Thus the sum 120.05 + 10.1 + 56.323 = 186.473 is
reported as 186.5 because 10.1 defines the reporting level. In complex calculations involving
multiplications and additions, for example, the operations are done serially, and the final
result is rounded according to the least number of significant figures involved. Thus
(1256 x 12.2) + 125 = 1.53 x 10^4 + 125 = 1.54 x 10^4.
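
The multiplication and serial-calculation examples can be reproduced with a short Python sketch. It is offered as an illustration and not as part of the handbook; the helper name `round_sig` is hypothetical, and the standard-library `decimal` module is used so that the decimal digits are handled exactly.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_sig(value, figures):
    """Round a Decimal to the given number of significant figures."""
    value = Decimal(value)
    if value == 0:
        return value
    # adjusted() is the power of ten of the most significant digit.
    quantum = Decimal(1).scaleb(value.adjusted() - figures + 1)
    return value.quantize(quantum, rounding=ROUND_HALF_EVEN)


# Multiplication: 12.2 has three significant figures, so the product gets three.
product = Decimal("1256") * Decimal("12.2")
print(product, "->", round_sig(product, 3))

# Serial calculation: carry full precision and round only the final result.
total = product + Decimal("125")
print(total, "->", round_sig(total, 3))
```

The number of figures to keep is supplied by the caller; the sketch does not try to infer it from the operands, since that judgment depends on the confidence that can be placed in the data.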
The following rules should be used in rounding data, consistent with its significance:


When the digit next beyond the one to be retained is less than five, the retained
figure is kept unchanged. For example: 2.541 becomes 2.5 to two significant
figures.

When the digit next beyond the one to be retained is greater than five, the retained
figure is increased by one. For example: 2.453 becomes 2.5 to two significant
figures.

When the digit next beyond the one to be retained is exactly five, and the retained
digit is even, it is left unchanged; if the retained digit is odd, it is increased by
one. Thus, 3.450 becomes 3.4 but 3.550 becomes 3.6 to two significant figures.

When two or more figures are to the right of the last figure to be retained, they are
to be considered as a group in rounding decisions. Thus in 2.4(501), the group (501)
is considered to be >5 while for 2.5(499), (499) is considered to be <5.
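
These rules correspond to what is often called round-half-even. A minimal Python sketch follows, assuming the standard-library `decimal` module and a hypothetical helper `round_to_tenths`; it applies the rules to the examples above, all of which are rounded to one decimal place. The values are passed as strings because binary floating-point numbers cannot represent most decimal fractions exactly, so an apparent tie may not be a true tie.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to_tenths(text):
    """Round a decimal string to one decimal place, ties going to the even digit."""
    return Decimal(text).quantize(Decimal("0.1"), rounding=ROUND_HALF_EVEN)


for text in ("2.541", "2.453", "3.450", "3.550", "2.4501", "2.5499"):
    print(text, "->", round_to_tenths(text))
```

Because the whole trailing group of digits is compared with the halfway point, 2.4501 is rounded up and 2.5499 is rounded down, in line with the last rule.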


7.3     Guidelines   for   Reporting  Results   of  Measurements 


Analysts often ask how many significant figures should be used in reporting results. This
will depend on the number of figures in the original data and the confidence limits that
support the results. Analysts sometimes feel that observed data have more digits than are
meaningful and are tempted to round them to what is felt to be significant. This should be
resisted, and rounding should be deferred to the last operation. The following guidelines are
recommended when deciding what is significant.

The number of figures to retain in experimental data and even in preliminary calculations
is unimportant, provided a certain minimum is exceeded. At least the last figure should vary
between successive trials, and variability of at least the last two figures is preferred. If
this is not the case, the data are probably truncated by the operation (e.g., low
attenuation), rounded off by the observer, or imprecisely read. Training of observers can often
improve the precision of reading. Observers can have preconceived ideas of the attainable
precision (or that required for some application) and will round off readings with this in
mind. Thus they may actually be throwing away data.

The  average  of  several  values  should  be  calculated  with  at  least  one  more  significant 
figure  than  that  of  the  data.  This  will  then  be  rounded  for  reporting,  consistent  with  the 
confidence   limits  estimated. 


The standard deviation (necessary for computing confidence intervals) should be computed
to three significant figures and rounded to two when reported as data. The confidence
interval should be calculated, then rounded to two significant figures (use more than this
number in the calculations as available), and the result reported should be consistent with
this. While any confidence level may be used, the 95 percent level is commonly used. However,
the level used for the calculation must be reported.

As an example of the above, the following data were observed:

15.2, 14.7, 15.1, 15.0, 15.3, 15.2, 14.9

x̄ = 15.057

s = 0.207

Confidence interval calculation:

ts/√n = (2.517 x 0.207)/√7 = 0.1969

The result reported is x̄ = 15.06 ± 0.20

where the uncertainty represents the 95 percent confidence interval for the mean of seven
measurements.
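
The arithmetic of this example can be checked with a few lines of Python. This is a sketch under stated assumptions and not part of the handbook: the first reading is taken as 15.2, the value that reproduces the stated mean and standard deviation, and the t factor is taken from the text as printed and passed in as a parameter.

```python
import math
import statistics

# Readings from the worked example; the first is taken as 15.2 (assumption,
# chosen because it reproduces the stated mean and standard deviation).
data = [15.2, 14.7, 15.1, 15.0, 15.3, 15.2, 14.9]

# Student's t factor as printed in the text; in real use, take the value from
# a t table for the chosen confidence level and n - 1 degrees of freedom.
T_FACTOR = 2.517


def confidence_half_width(values, t):
    """Half-width of the confidence interval for the mean: t * s / sqrt(n)."""
    return t * statistics.stdev(values) / math.sqrt(len(values))


mean = statistics.mean(data)
s = statistics.stdev(data)
half_width = confidence_half_width(data, T_FACTOR)

print(f"mean = {mean:.3f}, s = {s:.3f}, ts/sqrt(n) = {half_width:.4f}")
print(f"reported: {mean:.2f} +/- {half_width:.2f}")
```

Rounding is left to the final formatting step, as the guidelines recommend. Standard t tables list 2.447 for the two-sided 95 percent level with 6 degrees of freedom; passing that value instead of the printed factor gives a slightly narrower interval.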
8.     NBS Services Related to SRM's


The NBS Office of Standard Reference Materials and the various scientists who are
engaged in the development and certification of SRM's believe that the use of SRM's in a
systematic manner can provide a high level of confidence in analytical measurement data. To
the extent possible, assistance will be provided related to their use. While detailed advice
on specialized measurement problems cannot be provided, generic advice is available and may be
all that is needed in many situations. A list of the kinds of services that can be provided
and the telephone numbers to call to make initial contacts follows.

-  SRM  General 

o       General catalogues will be sent on request

o       Special catalogues will be sent to those identifying special interests, and updates
        will be sent as available

o       Announcements of new SRM's are sent to those on the special interest lists

o       Replacement of lost certificates will be made on request

-  Assistance   in  Ordering/Customer  Services 

o  Call  301  -  921  -20145  for  special  assistance  in  ordering,  quotations  on  current 
prices,    and  availability   of  specific  SRM's. 

-  Technical  Assistance 

o  Questions concerning applications, experimental results when measuring specific
SRM's, stability, selection of SRM's, details of certification and certified values,
and other technical matters are directed to the appropriate project manager (see p.
57) or to Customer Services.

-  SRM  Workshops 

o       Quality   Assurance   of   Chemical  Measurements 

A  2-day  seminar,  offered  semi-annually,  ordinarily  announced  through  the  SRM 
mailing   list.     Call    301    -   921-3197   for   current  information. 

o       Special   SRM  Workshops 

Special  workshops  concerned  with  development  of  new  SRM's  or  utilization  of  SRM's. 
Call   Customer  Services 

-  Calibration  Services 

NBS  provides  a  limited  amount  of  calibration  services  in  several  areas  of  physical 
measurement.  The  services  available  and  the  current  test  fee  may  be  obtained  by 
inquiry   to   the   NBS   Office   of  Measurement   Services,    301    -  921-2805. 
APPENDIX A.      GLOSSARY OF TERMS


Absolute  Method  -  A  method  in  which  characterization  is  based  entirely  on  physically 
(absolute)   defined  standards. 

Accreditation - A formal process by which a laboratory is evaluated, with respect to
established criteria, for its competence to perform a specified kind(s) of measurement.
Also, the decision based upon such a process. When a certificate is issued, the process
is often called certification.

Accuracy  -  The  degree  of  agreement  of  a  measured  value  with  the  true  or  expected  value  of  the 
quantity  of  concern. 

Aliquant - A divisor that does not divide a sample into a number of equal parts without
leaving a remainder; a sample resulting from such a divisor.

Aliquot  -  A  divisor  that  divides  a  sample  into  a  number  of  equal  parts,  leaving  no  remainder; 
a  sample  resulting   from  such  a  divisor. 

Analyte  -  The   specific   component   measured   in   a  chemical   analysis;    also   called  analate. 

Assignable  cause  -  A  cause  believed  to  be  responsible  for  an  identifiable  change  of  precision 
or   accuracy   of   a  measurement  process. 

Blank  -  The  measured  value  obtained  when  a  specified  component  of  a  sample  is  not  present 
during  the  measurement.  In  such  a  case,  the  measured  value/signal  for  the  component  is 
believed  to  be  due  to  artifacts,  hence  should  be  deducted  from  a  measured  value  to  give  a 
net  value  due  to  the  component  contained  in  a  sample.  The  blank  measurement  must  be  made 
so   that   the   correction   process   is  valid. 

Blind Sample - A sample submitted for analysis whose composition is known to the submitter but
unknown to the analyst. A blind sample thus is one way to test proficiency of a
measurement process.

Bias  -  A  systematic  error  inherent  in  a  method  or  caused  by  some  artifact  or  idiosyncrasy  of 
the  measurement  system.  Temperature  effects  and  extraction  inefficiencies  are  examples 
of  this  first  kind.  Blanks,  contamination,  mechanical  losses  and  calibration  errors  are 
examples  of  the  latter  kinds.  Bias  may  be  both  positive  and  negative  and  several  kinds 
can  exist  concurrently,  so  that  net  bias  is  all  that  can  be  evaluated,  except  under 
special  conditions. 

Bulk  sampling  -  Sampling  of  a  material  that  does  not  consist  of  discrete,  identifiable, 
constant   units,    but   rather   of   arbitrary,    irregular  units. 

Calibrant - A substance used to calibrate or to establish the analytical response of a
measurement system.

Calibration - Comparison of a measurement standard or instrument with another standard or
instrument to report or eliminate by adjustment any variation (deviation) in the accuracy
of the item being compared.

Cause-Effect  Diagram  -  A  graphical  representation  of  the  causes  that  can  produce  a  specified 
kind  of  error  in  measurement.  A  popular  one  is  the  so-called  fish  bone  diagram,  first 
described   by   Ishikawa,   given   this   name   because   of    its   suggestive  shape. 

Central   Line   -  The   long-term  expected   value   of  a   variable   displayed  on  a   control  chart. 

Certification   -  See  accreditation. 

Certified Reference Material (CRM) - A reference material one or more of whose property values
are certified by a technically valid procedure, accompanied by or traceable to a
certificate or other documentation which is issued by a certifying body.

Certified  Value  -  The  value  that  appears  in  a  certificate  as  the  best  estimate  of  the  value 
for   a   property  of   a  reference  material. 

Chance  Cause  -  A  cause  for  variability  of  a  measurement  process  that  occurs  unpredictably,  for 
unknown  reasons,    and   believed   to   happen   by   chance,  alone. 

Check  Standard  -  In  physical  calibration,  an  artifact  measured  periodically,  the  results  of 
which  typically   are   plotted  on  a   control   chart   to   evaluate   the  measurement  process. 


*For other definitions, and particularly those related to industrial quality assurance, see:
"Quality Systems Terminology", ANSI/ASQC Standard A3-1978, American National Standards
Institute, 1430 Broadway, New York, NY 10018.
Coefficient of Variation - The standard deviation divided by the value of the parameter
measured.


Common  Cause  -  A  cause  of  variability  of  a  measurement  process,  inherent  in  and  common  to  the 
process   itself,    as   contrasted   to   a  special   cause  (defined). 

Comparative Method - A method which is based on the intercomparison of the sample with a
chemical standard.

Confidence  Interval  -  That  range  of  values,  calculated  from  an  estimate  of  the  mean  and  the 
standard  deviation,  which  is  expected  to  include  the  population  mean  with  a  stated  level 
of  confidence.  Confidence  intervals  in  the  same  context  also  may  be  calculated  for 
standard   deviations,    lines,    slopes,  points. 

Control  Limit  -  The  limits  shown  on  a  control  chart  beyond  which  it  is  highly  improbable  that 
a   point   could   lie  while   the  system   remains   in   a   state   of   statistical  control. 

Control Chart - A graphical plot of test results with respect to time or sequence of
measurement together with limits within which they are expected to lie when the system is in a
state of statistical control.

Control  Sample  -  A  material  of  known  composition  that  is  analyzed  concurrently  with  test 
samples   to   evaluate   a  measurement   process    (see   also  Check  Standard). 

Composite  Sample  -  A  sample  composed  of  two  or  more  increments  selected  to  represent  a 
population  of  interest. 

Cross Sensitivity - A quantitative measure of the response obtained for an undesired
constituent (interferent) as compared to that for a constituent of interest.

Detection  Limit  -  The  smallest  concentration/amount  of  some  component  of  interest  that  can  be 
measured   by   a   single  measurement   with  a   stated   level   of  confidence. 

Double  Blind  -  A  sample,  known  by  the  submitter  but  submitted  to  an  analyst  in  such  a  way  that 
neither   its   composition   nor   its   identification   as   a   check   sample   are   known   to   the  latter. 

Duplicate  Measurement  -  A  second  measurement  made  on  the  same  (or  identical)  sample  of 
material   to   assist   in   the   evaluation   of  measurement  variance. 

Duplicate  Sample  -  A  second  sample  randomly  selected  from  a  population  of  interest  (see  also 
split   sample)    to   assist   in   the   evaluation   of   sample  variance. 

Education  -  Disciplining  the  mind  through  instruction  or  study.  Education  is  general  and 
prepares   the   mind   to  react   to  a   variety   of  situations. 

Error - Difference between the true or expected value and the measured value of a quantity or
parameter.

Figure  of  Merit  -  A  performance  characteristic  of  a  method  believed  to  be  useful  when  deciding 
its  applicability  for  a  specific  measurement  situation.  Typical  figures  of  merit 
include:      selectivity;    sensitivity;    detection   limit;    precision;  bias. 

Good  Laboratory  Practice  (GLP)  -  An  acceptable  way  to  perform  some  basic  operation  or  activity 
in  a  laboratory,  that  is  known  or  believed  to  influence  the  quality  of  its  outputs.  GLP's 
ordinarily   are   essentially   independent   of   the  measurement   techniques  used. 

Good  Measurement  Practice  (GMP)  -  An  acceptable  way  to  perform  some  operation  associated  with 
a  specific  measurement  technique,  and  which  is  known  or  believed  to  influence  the  quality 
of   the  measurement. 

Gross  Sample  (also  called  bulk  sample,  lot  sample)  -  One  or  more  increments  of  material  taken 
from  a   larger   quantity    (lot)    of   material   for   assay   or   record  purposes. 

Homogeneity - The degree to which a property or substance is randomly distributed throughout a
material. Homogeneity depends on the size of the subsample under consideration. Thus a
mixture of two minerals may be inhomogeneous at the molecular or atomic level, but
homogeneous at the particulate level.

Increment - An individual portion of material collected by a single operation of a sampling
device, from parts of a lot separated in time or space. Increments may be either tested
individually or combined (composited) and tested as a unit.

Individuals  -   Conceivable   constituent   parts   of   a  population. 

Informational  Value  -  Value  of  a  property,  not  certified  but  provided  because  it  is  believed 
to   be  reliable   and   to   provide   information   important   to   the   certified  material. 
Intercalibration - The process, procedures, and activities used to ensure that the several
laboratories engaged in a monitoring program can produce compatible data. When
compatible data outputs are achieved and this situation is maintained, the laboratories can be
said to be intercalibrated.

Laboratory Sample - A sample, intended for testing or analysis, prepared from a gross sample
or otherwise obtained. The laboratory sample must retain the composition of the gross
sample. Often reduction in particle size is necessary in the course of reducing the
quantity.

Limiting  Mean  -  The  value  approached  by  the  average  as  the  number  of  measurements,  made  by  a 
stable  measurement   process,    increases  indefinitely. 

Limit  of  Linearity  (LOL)  -  The  upper  limit  of  concentration  or  amount  of  substance  for  which 
incremental   additions   produce   constant   increments   of  response. 

Limit of Quantitation (LOQ) - The lower limit of concentration or amount of substance that
must be present before a method is considered to provide quantitative results. By
convention LOQ = 10 s0, where s0 is the estimate of the standard deviation at the lowest
level of measurement.

Lot   -  A  quantity  of   bulk  material   of   similar   composition  whose   properties   are   under  study. 

Method   -  An   assemblage   of   measurement   techniques   and   the   order   in  which  they   are  used. 

Outlier  -  A  value  which  appears  to  deviate  markedly  from  that  for  other  members  of  the  sample 
in  which   it  occurs. 


Pareto    Analysis    -    A    statistical     approach    to    ranking    assignable    causes       according    to  the 
frequency   of  occurrence. 

Performance Audit - A process to evaluate the proficiency of an analyst/laboratory by
evaluation of the results obtained on known test materials.


Population  -  A  generic  term  denoting  any  finite  or  infinite  collection  of  individual  things, 
objects,  or  events;  in  the  broadest  concept,  an  aggregate  determined  by  some  property 
that   distinguishes   things   that   do   and   do   not  belong. 

Precision  -  The  degree  of  mutual  agreement  characteristic  of  independent  measurements  as  the 
result  of  repeated  application  of  the  process  under  specified  conditions.  It  is 
concerned   with   the   closeness   together   of  results. 

Primary  Standard  -  A  substance  or  artifact,  the  value  of  which  can  be  accepted  (within 
specific  limits)  without  question  when  used  to  establish  the  value  of  the  same  or  related 
property  of  another  material.  Note  that  the  primary  standard  for  one  user  may  have  been 
a  secondary  standard   of  another. 

Probability  -  The  likelihood  of  the  occurrence  of  any  particular  form  of  an  event,  estimated 
as  the  ratio  of  the  number  of  ways  or  times  that  the  event  may  occur  in  that  form  to  the 
total   number   of   ways   that    it   could  occur    in   any  form. 

Procedure  -  A  set  of  systematic  instructions  for  using  a  method  of  measurement  or  of  sampling 
or   of   the   steps   or   operations   associated  with  such. 

Protocol  -  A  procedure  specified  to  be  used  when  performing  a  measurement  or  related 
operation,    as   a  condition   to  obtain  results    that   could   be   acceptable   to   the  specifier. 

Protocol  for  a  Specific  Purpose  (PSP)  -  Detailed  instructions  for  the  performance  of  all 
aspects   of   a  measurement   program.     This   is   sometimes   called   a   project   QA  plan. 

Quality  -  An  estimation  of  acceptability  or  suitability  for  a  given  purpose  of  an  object, 
item,    tangible,    or   intangible  thing. 

Quality Assessment - The overall system of activities whose purpose is to provide assurance
that the quality control activities are being done effectively. It involves a continuing
evaluation of performance of the production system and the quality of the products
produced.

Quality Assurance - A system of activities whose purpose is to provide to the producer or user
of a product or a service the assurance that it meets defined standards of quality. It
consists of two separate but related activities, quality control and quality assessment
(defined).

Quality  Circle  -  A  small  group  of  individuals  with  related  interests  that  meets  at  regular 
intervals  to  consider  problems  or  other  matters  related  to  the  quality  of  outputs  of  a 
process   and   to  the   correction  of   problems   or   to   the   improvement   of  quality. 
Quality Control - The overall system of activities whose purpose is to control the quality of
a  product  or  service  so  that  it  meets  the  needs  of  users.  The  aim  is  to  provide  quality 
that    is   satisfactory,    adequate,   dependable,    and  economic. 

Random  Sample   -  A   sample   selected   from  a   population,    using  a  randomization  process. 

Reduction  -  The   process   of   preparing  one  or   more   subsamples   from   a  sample. 

Reference  Material  (RM)  -  A  material  or  substance  one  or  more  properties  of  which  are 
sufficiently  well  established  to  be  used  for  the  calibration  of  an  apparatus,  the 
assessment   of   a  measurement  method,    or   for   the   assignment   of   values   to  materials. 

Reference  Method  -  A  method  which  has  been  specified  as  capable,  by  virtue  of  recognized 
accuracy,   of   providing   primary  reference  data. 

Relative   Standard   Deviation   -  The   coefficient   of   variation,    expressed   as   a  percentage. 

Replicate - A counterpart of another, usually referring to an analytical sample or a
measurement. It is the general case for which duplicate is the special case consisting of two
samples or measurements.

Routine  Method   -   A  method   used   in  recurring   analytical  problems. 

Sample - A portion of a population or lot. It may consist of an individual or groups of
individuals. It may refer to objects, materials, or to measurements, conceivable as part of
a larger group that could have been considered.

Secondary  Standard  -  A  standard  whose  value  is  based  upon  comparison  with  some  primary 
standard.  Note  that  a  secondary  standard,  once  its  value  is  established,  can  become  a 
primary   standard   for   some   other  user. 

Segment   -  A   specifically   demarked   portion   of   a   lot,    either   actual   or  hypothetical. 

Selectivity - The ability of methodology or instrumentation to respond to a desired substance
or constituent and not to others. It is sometimes quantified as cross sensitivity, which
see.

Sensitivity - Capability of methodology or instrumentation to discriminate between samples
having differing concentrations or containing differing amounts of an analate.

Significant  Figure  -  A  figure(s)  that  remains  to  a  number  or  decimal  after  the  ciphers  to  the 
right   or   left   are  cancelled. 

Special  Cause  -  A  cause  of  variance  or  bias  that  is  external  (not  inherent)  to  the  measurement 
system. 

Split  Sample  -  A  replicate  portion  or  sub-sample  of  a  total  sample  obtained  in  such  a  manner 
that    it    is    not    believed    to    differ    significantly    from    other    portions    of    the    same  sample. 

Standard - A substance or material, the properties of which are believed to be known with
sufficient accuracy to permit its use to evaluate the same property of another. In chemical
measurements, it often describes a solution or substance, commonly prepared by the
analyst, to establish a calibration curve or the analytical response function of an
instrument.

Standard Addition - A method in which small increments of a substance under measurement are
added to a sample under test to establish a response function or, by extrapolation, to
determine the amount of a constituent originally present in the test sample.

Standardization  -  (In  analytical  chemistry)  the  assignment  of  a  compositional  value  to  one 
standard  on   the   basis   of   another  standard. 

Standard Method - A method (or procedure) of test developed by a standards-writing
organization,  based  on  consensus  opinion  or  other  criteria,  and  often  evaluated  for  its 
reliability   by   a   collaborative   testing  procedure. 

Standard  Operations  Procedure  (SOP)  -  A  procedure  adopted  for  repetitive  use  when  performing  a 
specific  measurement  or  sampling  operation.  It  may  be  a  standard  method  or  one  developed 
by the user.

Standard  Reference  Material  -  A  reference  material  distributed  and  certified  by  the  National 
Bureau  of  Standards. 

Strata  -  Segments   of   a   lot   that   may   vary  with  respect   to   the   property   under  study. 

Subsample - A portion taken from a sample. A laboratory sample may be a subsample of a gross
sample;   similarly,   a  test  portion  may  be  a  subsample  of  a  laboratory  sample. 

Technique  -  A  physical  or  chemical  principle  utilized  separately  or  in  combination  with  other 
techniques  to  determine  the  composition   (analysis)   of  materials. 

Test Portion (also called specimen, test specimen, test unit, aliquot) - That quantity of
a material of proper size for measurement of the property of interest. Test portions may
be  taken  from  the  gross  sample  directly,  but  often  preliminary  operations,  such  as  mixing 
or  further  reduction  in  particle  size,   are  necessary. 

Tolerance Interval - That range of values, calculated from an estimate of the mean and the
standard deviation, within which a specified percentage of individual values of a population
(measurements or sample) are expected to lie with a stated level of confidence.

Traceability - The ability to trace the source of uncertainty of a measurement or a measured
value.

Training  -  Formal  or  informal  instruction  designed  to  provide  competence  of  a  specific  nature. 


Uncertainty  -  The  range  of  values  within  which  the  true  value  is  estimated  to  lie.  It  is  a 
best  estimate  of  possible  inaccuracy  due  to  both  random  and  systematic  error. 

Validation  -  The  process  by  which  a  sample,  measurement  method,  or  a  piece  of  data  is  deemed 
to  be  useful  for  a  specified  purpose. 

Variance - The value approached by the average of the sum of the squares of deviations of
individual measurements from the limiting mean. Mathematically, it may be expressed as

$$\frac{\sum (x_i - m)^2}{n} \rightarrow \sigma^2 \quad \text{as } n \rightarrow \infty$$

Ordinarily it cannot be known but only its estimate, s², which is calculated by the
expression

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Warning Limits - The limits shown on a control chart within which most of the test results are
expected to lie (within a 95% probability) while the system remains in a state of statistical
control.

Youden Plot - A graphical presentation of data, recommended first by W. J. Youden, in which
the result(s) obtained by a laboratory on one sample is plotted with respect to the
result(s) it obtained on a similar sample. It helps in deciding whether discrepant
results are due to random or systematic error.




APPENDIX   B.      CONVERSION   FACTORS   AND  TABLES 


B.1     Conversion  Factors 


In the metric system of weights and measures, designations of multiples and subdivisions
of any unit may be arrived at by combining with the name of the unit, the prefixes deka,
hecto, and kilo, meaning, respectively, 10, 100, and 1 000, and deci, centi, and milli,
meaning, respectively, one-tenth, one-hundredth, and one-thousandth. In some of the following
metric tables, some such multiples and subdivisions have not been included for the reason that
these have little, if any, actual usage.


In certain cases, particularly in scientific usage, it becomes convenient to provide for
multiples larger than 1 000 and for subdivisions smaller than one-thousandth. Accordingly,
the following prefixes have been introduced and these are now generally recognized:


exa,   (E),  meaning 10^18
peta,  (P),  meaning 10^15
tera,  (T),  meaning 10^12
giga,  (G),  meaning 10^9
mega,  (M),  meaning 10^6
kilo,  (k),  meaning 10^3
hecto, (h),  meaning 10^2
deka,  (da), meaning 10^1

deci,  (d),  meaning 10^-1
centi, (c),  meaning 10^-2
milli, (m),  meaning 10^-3
micro, (µ),  meaning 10^-6
nano,  (n),  meaning 10^-9
pico,  (p),  meaning 10^-12
femto, (f),  meaning 10^-15
atto,  (a),  meaning 10^-18

Thus a kilometer is 1 000 meters and a millimeter is 0.001 meter.
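
The prefix scheme above amounts to a lookup of powers of ten. The following Python sketch is an illustration only: the names `SI_PREFIX_EXPONENT` and `convert` are hypothetical and not part of this handbook, and the ASCII letter "u" stands in for the micro symbol.

```python
# Power of ten associated with each metric prefix symbol.
# The empty string stands for the unprefixed unit.
SI_PREFIX_EXPONENT = {
    "E": 18, "P": 15, "T": 12, "G": 9, "M": 6, "k": 3, "h": 2, "da": 1,
    "": 0,
    "d": -1, "c": -2, "m": -3, "u": -6, "n": -9, "p": -12, "f": -15, "a": -18,
}


def convert(value, from_prefix, to_prefix):
    """Re-express a value given with one prefix in terms of another prefix."""
    shift = SI_PREFIX_EXPONENT[from_prefix] - SI_PREFIX_EXPONENT[to_prefix]
    return value * 10.0 ** shift


# 1 kilometer expressed in meters, and 1 millimeter expressed in meters.
kilometer_in_meters = convert(1, "k", "")
millimeter_in_meters = convert(1, "m", "")
```

The same function covers every table in this section, since each entry differs from its neighbor only by a power of ten (squared or cubed for area and solid volume).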


LINEAR MEASURE

10 millimeters (mm) = 1 centimeter (cm)
10 centimeters      = 1 decimeter (dm) = 100 millimeters
10 decimeters       = 1 meter (m) = 1 000 millimeters
10 meters           = 1 dekameter (dam)
10 dekameters       = 1 hectometer (hm) = 100 meters
10 hectometers      = 1 kilometer (km) = 1 000 meters


AREA MEASURE

100 square millimeters (mm²) = 1 square centimeter (cm²)
100 square centimeters       = 1 square decimeter (dm²)
100 square decimeters        = 1 square meter (m²)
100 square meters            = 1 square dekameter (dam²) = 1 are
100 square dekameters        = 1 square hectometer (hm²) = 1 hectare (ha)
100 square hectometers       = 1 square kilometer (km²)


FLUID VOLUME MEASURE

10 milliliters (mL) = 1 centiliter (cL)
10 centiliters      = 1 deciliter (dL) = 100 milliliters
10 deciliters       = 1 liter = 1 000 milliliters
10 liters           = 1 dekaliter (daL)
10 dekaliters       = 1 hectoliter (hL) = 100 liters
10 hectoliters      = 1 kiloliter (kL) = 1 000 liters


SOLID VOLUME MEASURE

1 000 cubic millimeters (mm³) = 1 cubic centimeter (cm³)
1 000 cubic centimeters       = 1 cubic decimeter (dm³)
                              = 1 000 000 cubic millimeters
1 000 cubic decimeters        = 1 cubic meter (m³)
                              = 1 000 000 cubic centimeters
                              = 1 000 000 000 cubic millimeters


WEIGHT

10 milligrams (mg) = 1 centigram (cg)
10 centigrams      = 1 decigram (dg) = 100 milligrams
10 decigrams       = 1 gram (g) = 1 000 milligrams
10 grams           = 1 dekagram (dag)
10 dekagrams       = 1 hectogram (hg) = 100 grams
10 hectograms      = 1 kilogram (kg) = 1 000 grams
1 000 kilograms    = 1 megagram (Mg) or 1 metric ton (t)




Table B.2. Use of Range to Estimate Variability

Number of               Number of Measurements in a Set
Sets, k    Factor       2       3       4       5       6

    3      d2*        1.23    1.77    2.12    2.38    2.58
           ν          2.83    5.86    8.44    11.1    13.6

    5      d2*        1.19    1.74    2.10    2.36    2.56
           ν          4.59    9.31    13.9    18.4    22.6

   10      d2*        1.16    1.72    2.08    2.34    2.55
           ν          8.99    18.4    27.6    36.5    44.9

   20      d2*        1.14    1.70    2.07    2.33    2.54
           ν          17.8    36.5    55.0    72.7    89.6

    ∞      d2         1.13    1.69    2.06    2.33    2.53

R̄ = mean range of k sets of replicate measurements

ν = degrees of freedom in estimate of standard deviation

For k > 20, ν = 0.876 k


Adapted from Lloyd S. Nelson, J. Qual. Tech. 7(1), January (1975)

Table B.3 Critical Values for the F Test


Critical values for a 2-tailed test of equality of standard deviation
estimates at 5% level of significance

F.975 (n1, n2)

n1 = degrees of freedom for numerator (columns)
n2 = degrees of freedom for denominator (rows)

  n2      1      2      3      4      5      6      7      8      9     10     12     15     20     24     30     40     60    120      ∞

   1  647.8  799.5  864.2  899.6  921.8  937.1  948.2  956.7  963.3  968.6  976.7  984.9  993.1  997.2   1001   1006   1010   1014   1018
   2  38.51  39.00  39.17  39.25  39.30  39.33  39.36  39.37  39.39  39.40  39.41  39.43  39.45  39.46  39.46  39.47  39.48  39.49  39.50
   3  17.44  16.04  15.44  15.10  14.88  14.73  14.62  14.54  14.47  14.42  14.34  14.25  14.17  14.12  14.08  14.04  13.99  13.95  13.90
   4  12.22  10.65   9.98   9.60   9.36   9.20   9.07   8.98   8.90   8.84   8.75   8.66   8.56   8.51   8.46   8.41   8.36   8.31   8.26
   5  10.01   8.43   7.76   7.39   7.15   6.98   6.85   6.76   6.68   6.62   6.52   6.43   6.33   6.28   6.23   6.18   6.12   6.07   6.02
   6   8.81   7.26   6.60   6.23   5.99   5.82   5.70   5.60   5.52   5.46   5.37   5.27   5.17   5.12   5.07   5.01   4.96   4.90   4.85
   7   8.07   6.54   5.89   5.52   5.29   5.12   4.99   4.90   4.82   4.76   4.67   4.57   4.47   4.42   4.36   4.31   4.25   4.20   4.14
   8   7.57   6.06   5.42   5.05   4.82   4.65   4.53   4.43   4.36   4.30   4.20   4.10   4.00   3.95   3.89   3.84   3.78   3.73   3.67
   9   7.21   5.71   5.08   4.72   4.48   4.32   4.20   4.10   4.03   3.96   3.87   3.77   3.67   3.61   3.56   3.51   3.45   3.39   3.33
  10   6.94   5.46   4.83   4.47   4.24   4.07   3.95   3.85   3.78   3.72   3.62   3.52   3.42   3.37   3.31   3.26   3.20   3.14   3.08
  11   6.72   5.26   4.63   4.28   4.04   3.88   3.76   3.66   3.59   3.53   3.43   3.33   3.23   3.17   3.12   3.06   3.00   2.94   2.88
  12   6.55   5.10   4.47   4.12   3.89   3.73   3.61   3.51   3.44   3.37   3.28   3.18   3.07   3.02   2.96   2.91   2.85   2.79   2.72
  13   6.41   4.97   4.35   4.00   3.77   3.60   3.48   3.39   3.31   3.25   3.15   3.05   2.95   2.89   2.84   2.78   2.72   2.66   2.60
  14   6.30   4.86   4.24   3.89   3.66   3.50   3.38   3.29   3.21   3.15   3.05   2.95   2.84   2.79   2.73   2.67   2.61   2.55   2.49
  15   6.20   4.77   4.15   3.80   3.58   3.41   3.29   3.20   3.12   3.06   2.96   2.86   2.76   2.70   2.64   2.59   2.52   2.46   2.40
  16   6.12   4.69   4.08   3.73   3.50   3.34   3.22   3.12   3.05   2.99   2.89   2.79   2.68   2.63   2.57   2.51   2.45   2.38   2.32
  17   6.04   4.62   4.01   3.66   3.44   3.28   3.16   3.06   2.98   2.92   2.82   2.72   2.62   2.56   2.50   2.44   2.38   2.32   2.25
  18   5.98   4.56   3.95   3.61   3.38   3.22   3.10   3.01   2.93   2.87   2.77   2.67   2.56   2.50   2.44   2.38   2.32   2.26   2.19
  19   5.92   4.51   3.90   3.56   3.33   3.17   3.05   2.96   2.88   2.82   2.72   2.62   2.51   2.45   2.39   2.33   2.27   2.20   2.13
  20   5.87   4.46   3.86   3.51   3.29   3.13   3.01   2.91   2.84   2.77   2.68   2.57   2.46   2.41   2.35   2.29   2.22   2.16   2.09
  21   5.83   4.42   3.82   3.48   3.25   3.09   2.97   2.87   2.80   2.73   2.64   2.53   2.42   2.37   2.31   2.25   2.18   2.11   2.04
  22   5.79   4.38   3.78   3.44   3.22   3.05   2.93   2.84   2.76   2.70   2.60   2.50   2.39   2.33   2.27   2.21   2.14   2.08   2.00
  23   5.75   4.35   3.75   3.41   3.18   3.02   2.90   2.81   2.73   2.67   2.57   2.47   2.36   2.30   2.24   2.18   2.11   2.04   1.97
  24   5.72   4.32   3.72   3.38   3.15   2.99   2.87   2.78   2.70   2.64   2.54   2.44   2.33   2.27   2.21   2.15   2.08   2.01   1.94
  25   5.69   4.29   3.69   3.35   3.13   2.97   2.85   2.75   2.68   2.61   2.51   2.41   2.30   2.24   2.18   2.12   2.05   1.98   1.91
  26   5.66   4.27   3.67   3.33   3.10   2.94   2.82   2.73   2.65   2.59   2.49   2.39   2.28   2.22   2.16   2.09   2.03   1.95   1.88
  27   5.63   4.24   3.65   3.31   3.08   2.92   2.80   2.71   2.63   2.57   2.47   2.36   2.25   2.19   2.13   2.07   2.00   1.93   1.85
  28   5.61   4.22   3.63   3.29   3.06   2.90   2.78   2.69   2.61   2.55   2.45   2.34   2.23   2.17   2.11   2.05   1.98   1.91   1.83
  29   5.59   4.20   3.61   3.27   3.04   2.88   2.76   2.67   2.59   2.53   2.43   2.32   2.21   2.15   2.09   2.03   1.96   1.89   1.81
  30   5.57   4.18   3.59   3.25   3.03   2.87   2.75   2.65   2.57   2.51   2.41   2.31   2.20   2.14   2.07   2.01   1.94   1.87   1.79
  40   5.42   4.05   3.46   3.13   2.90   2.74   2.62   2.53   2.45   2.39   2.29   2.18   2.07   2.01   1.94   1.88   1.80   1.72   1.64
  60   5.29   3.93   3.34   3.01   2.79   2.63   2.51   2.41   2.33   2.27   2.17   2.06   1.94   1.88   1.82   1.74   1.67   1.58   1.48
 120   5.15   3.80   3.23   2.89   2.67   2.52   2.39   2.30   2.22   2.16   2.05   1.94   1.82   1.76   1.69   1.61   1.53   1.43   1.31
   ∞   5.02   3.69   3.12   2.79   2.57   2.41   2.29   2.19   2.11   2.05   1.94   1.83   1.71   1.64   1.57   1.48   1.39   1.27   1.00

Excerpted from "Experimental Statistics" (19) Table A.5 which may be consulted for more
extensive listings.

Table B.4 Factors for Computing Two-Sided Confidence Limits for σ


Degrees of        α = .05             α = .01             α = .001
Freedom         B_U      B_L        B_U      B_L        B_U      B_L

    1         17.79    .3576      86.31    .2969      844.4    .2480
    2         4.859    .4581      10.70    .3879      33.29    .3291
    3         3.183    .5178      5.449    .4453      11.65    .3824
    4         2.567    .5590      3.892    .4865      6.938    .4218
    5         2.248    .5899      3.175    .5182      5.085    .4529
    6         2.052    .6143      2.764    .5437      4.128    .4784
    7         1.918    .6344      2.498    .5650      3.551    .5000
    8         1.820    .6513      2.311    .5830      3.167    .5186
    9         1.746    .6657      2.173    .5987      2.894    .5348
   10         1.686    .6784      2.065    .6125      2.689    .5492
   11         1.638    .6896      1.980    .6248      2.530    .5621
   12         1.598    .6995      1.909    .6358      2.402    .5738
   13         1.564    .7084      1.851    .6458      2.298    .5845
   14         1.534    .7166      1.801    .6549      2.210    .5942
   15         1.509    .7240      1.758    .6632      2.136    .6032
   16         1.486    .7308      1.721    .6710      2.073    .6116
   17         1.466    .7372      1.688    .6781      2.017    .6193
   18         1.448    .7430      1.658    .6848      1.968    .6266
   19         1.432    .7484      1.632    .6909      1.925    .6333
   20         1.417    .7535      1.609    .6968      1.886    .6397
   21         1.404    .7582      1.587    .7022      1.851    .6457
   22         1.391    .7627      1.568    .7074      1.820    .6514
   23         1.380    .7669      1.550    .7122      1.791    .6568
   24         1.370    .7709      1.533    .7169      1.765    .6619
   25         1.360    .7747      1.518    .7212      1.741    .6668
   26         1.351    .7783      1.504    .7253      1.719    .6713
   27         1.343    .7817      1.491    .7293      1.698    .6758
   28         1.335    .7849      1.479    .7331      1.679    .6800
   29         1.327    .7880      1.467    .7367      1.661    .6841
   30         1.321    .7909      1.457    .7401      1.645    .6880
   31         1.314    .7937      1.447    .7434      1.629    .6917
   32         1.308    .7964      1.437    .7467      1.615    .6953
   33         1.302    .7990      1.428    .7497      1.601    .6987
   34         1.296    .8015      1.420    .7526      1.588    .7020
   35         1.291    .8039      1.412    .7554      1.576    .7052
   36         1.286    .8062      1.404    .7582      1.564    .7083
   37         1.281    .8085      1.397    .7608      1.553    .7113
   38         1.277    .8106      1.390    .7633      1.543    .7141
   39         1.272    .8126      1.383    .7658      1.533    .7169
   40         1.268    .8146      1.377    .7681      1.523    .7197
   41         1.264    .8166      1.371    .7705      1.515    .7223
   42         1.260    .8184      1.365    .7727      1.506    .7248
   43         1.257    .8202      1.360    .7748      1.498    .7273
   44         1.253    .8220      1.355    .7769      1.490    .7297
   45         1.249    .8237      1.349    .7789      1.482    .7320
   46         1.246    .8253      1.345    .7809      1.475    .7342
   47         1.243    .8269      1.340    .7828      1.468    .7364
   48         1.240    ...        1.335    ...        ...      .7386
   49         ...      ...        1.331    .7864      1.455    .7407
   50         1.234    .8314      1.327    .7882      1.449    .7427

Excerpted from "Experimental Statistics" (19) Table A.20 which may be consulted for more
extensive listings.
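
As a hedged illustration of how these factors are applied: the limits for σ are obtained by multiplying an estimate s by B_L and B_U for its degrees of freedom. The Python sketch below uses s = 0.207 with 6 degrees of freedom (the result of Example C.2.1) and the α = .05 factors from the table; the function name `sigma_confidence_limits` is hypothetical.

```python
def sigma_confidence_limits(s, b_lower, b_upper):
    """Two-sided confidence limits for sigma: (B_L * s, B_U * s)."""
    return b_lower * s, b_upper * s


# Factors read from Table B.4 for 6 degrees of freedom at alpha = .05.
B_U_6DF = 2.052
B_L_6DF = 0.6143

lower, upper = sigma_confidence_limits(0.207, B_L_6DF, B_U_6DF)
print(f"sigma is expected to lie between {lower:.3f} and {upper:.3f}")
```

The interval is asymmetric about s because the upper factor departs from 1 by more than the lower factor does, and the asymmetry is most pronounced at small degrees of freedom.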




Table B.5 Percentiles of the t Distribution


Confidence level of
2-sided interval, %     20      40      60      80      90      95      98      99

  df     t.60    t.70    t.80    t.90    t.95    t.975   t.99    t.995

   1     .325    .727   1.376   3.078   6.314  12.706  31.821  63.657
   2     .289    .617   1.061   1.886   2.920   4.303   6.965   9.925
   3     .277    .584    .978   1.638   2.353   3.182   4.541   5.841
   4     .271    .569    .941   1.533   2.132   2.776   3.747   4.604
   5     .267    .559    .920   1.476   2.015   2.571   3.365   4.032
   6     .265    .553    .906   1.440   1.943   2.447   3.143   3.707
   7     .263    .549    .896   1.415   1.895   2.365   2.998   3.499
   8     .262    .546    .889   1.397   1.860   2.306   2.896   3.355
   9     .261    .543    .883   1.383   1.833   2.262   2.821   3.250
  10     .260    .542    .879   1.372   1.812   2.228   2.764   3.169
  11     .260    .540    .876   1.363   1.796   2.201   2.718   3.106
  12     .259    .539    .873   1.356   1.782   2.179   2.681   3.055
  13     .259    .538    .870   1.350   1.771   2.160   2.650   3.012
  14     .258    .537    .868   1.345   1.761   2.145   2.624   2.977
  15     .258    .536    .866   1.341   1.753   2.131   2.602   2.947
  16     .258    .535    .865   1.337   1.746   2.120   2.583   2.921
  17     .257    .534    .863   1.333   1.740   2.110   2.567   2.898
  18     .257    .534    .862   1.330   1.734   2.101   2.552   2.878
  19     .257    .533    .861   1.328   1.729   2.093   2.539   2.861
  20     .257    .533    .860   1.325   1.725   2.086   2.528   2.845
  21     .257    .532    .859   1.323   1.721   2.080   2.518   2.831
  22     .256    .532    .858   1.321   1.717   2.074   2.508   2.819
  23     .256    .532    .858   1.319   1.714   2.069   2.500   2.807
  24     .256    .531    .857   1.318   1.711   2.064   2.492   2.797
  25     .256    .531    .856   1.316   1.708   2.060   2.485   2.787
  26     .256    .531    .856   1.315   1.706   2.056   2.479   2.779
  27     .256    .531    .855   1.314   1.703   2.052   2.473   2.771
  28     .256    .530    .855   1.313   1.701   2.048   2.467   2.763
  29     .256    .530    .854   1.311   1.699   2.045   2.462   2.756
  30     .256    .530    .854   1.310   1.697   2.042   2.457   2.750
  40     .255    .529    .851   1.303   1.684   2.021   2.423   2.704
  60     .254    .527    .848   1.296   1.671   2.000   2.390   2.660
 120     .254    .526    .845   1.289   1.658   1.980   2.358   2.617
   ∞     .253    .524    .842   1.282   1.645   1.960   2.326   2.576

Excerpted from "Experimental Statistics" (19) Table A-4, which may be consulted for more
extensive listings.
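
A common use of these percentiles, sketched here under the usual assumption of normally distributed measurements, is the two-sided confidence interval for a mean, x̄ ± t·s/√n, with t taken at n-1 degrees of freedom. The Python example below uses the data summary of Example C.2.1 (x̄ = 15.057, s = 0.207, n = 7) and t.975 for 6 degrees of freedom; the function name `mean_confidence_interval` is hypothetical.

```python
import math


def mean_confidence_interval(mean, s, n, t_value):
    """Two-sided confidence interval for the mean: mean +/- t * s / sqrt(n)."""
    half_width = t_value * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# t.975 for 6 degrees of freedom (95% two-sided interval), from Table B.5.
T_975_6DF = 2.447

low, high = mean_confidence_interval(15.057, 0.207, 7, T_975_6DF)
print(f"95% confidence interval for the mean: {low:.3f} to {high:.3f}")
```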




Table B.6 Factors for Two-Sided Tolerance Limits for Normal Distributions


                       γ = 0.95                                      γ = 0.99
 n \ P     0.75     0.90     0.95     0.99    0.999       0.75     0.90     0.95     0.99    0.999

   2     22.858   32.019   37.674   48.430   60.573    114.363  160.193  188.491  242.300  303.054
   3      5.922    8.380    9.916   12.861   16.208     13.378   18.930   22.401   29.055   36.616
   4      3.779    5.369    6.370    8.299   10.502      6.614    9.398   11.150   14.527   18.383
   5      3.002    4.275    5.079    6.634    8.415      4.643    6.612    7.855   10.260   13.015
   6      2.604    3.712    4.414    5.775    7.337      3.743    5.337    6.345    8.301   10.548
   7      2.361    3.369    4.007    5.248    6.676      3.233    4.613    5.488    7.187    9.142
   8      2.197    3.136    3.732    4.891    6.226      2.905    4.147    4.936    6.468    8.234
   9      2.078    2.967    3.532    4.631    5.899      2.677    3.822    4.550    5.966    7.600
  10      1.987    2.839    3.379    4.433    5.649      2.508    3.582    4.265    5.594    7.129
  11      1.916    2.737    3.259    4.277    5.452      2.378    3.397    4.045    5.308    6.766
  12      1.858    2.655    3.162    4.150    5.291      2.274    3.250    3.870    5.079    6.477
  13      1.810    2.587    3.081    4.044    5.158      2.190    3.130    3.727    4.893    6.240
  14      1.770    2.529    3.012    3.955    5.045      2.120    3.029    3.608    4.737    6.043
  15      1.735    2.480    2.954    3.878    4.949      2.060    2.945    3.507    4.605    5.876
  16      1.705    2.437    2.903    3.812    4.865      2.009    2.872    3.421    4.492    5.732
  17      1.679    2.400    2.858    3.754    4.791      1.965    2.808    3.345    4.393    5.607
  18      1.655    2.366    2.819    3.702    4.725      1.926    2.753    3.279    4.307    5.497
  19      1.635    2.337    2.784    3.656    4.667      1.891    2.703    3.221    4.230    5.399
  20      1.616    2.310    2.752    3.615    4.614      1.860    2.659    3.168    4.161    5.312
  21      1.599    2.286    2.723    3.577    4.567      1.833    2.620    3.121    4.100    5.234
  22      1.584    2.264    2.697    3.543    4.523      1.808    2.584    3.078    4.044    5.163
  23      1.570    2.244    2.673    3.512    4.484      1.785    2.551    3.040    3.993    5.098
  24      1.557    2.225    2.651    3.483    4.447      1.764    2.522    3.004    3.947    5.039
  25      1.545    2.208    2.631    3.457    4.413      1.745    2.494    2.972    3.904    4.985
  26      1.534    2.193    2.612    3.432    4.382      1.727    2.469    2.941    3.865    4.935
  27      1.523    2.178    2.595    3.409    4.353      1.711    2.446    2.914    3.828    4.888

P = proportion of population covered
γ = confidence level

n = number of individuals (measurements, samples) used to compute x̄ and s

Excerpted from "Experimental Statistics" (19) Table A.6 which may be consulted for more
extensive listings.
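
Tolerance limits are conventionally formed as x̄ ± k·s, with k taken from the table for the chosen P, γ, and n. The following Python sketch is an illustration under that convention, using the summary of Example C.2.1 (x̄ = 15.057, s = 0.207, n = 7) and the factor for P = 0.95, γ = 0.95; the function name `tolerance_limits` is hypothetical.

```python
def tolerance_limits(mean, s, k_factor):
    """Two-sided tolerance limits for a normal population: mean +/- k * s."""
    return mean - k_factor * s, mean + k_factor * s


# Factor from Table B.6 for n = 7, P = 0.95, gamma = 0.95.
K_N7_P95_G95 = 4.007

low, high = tolerance_limits(15.057, 0.207, K_N7_P95_G95)
print(f"95% of the population is expected between {low:.2f} and {high:.2f} "
      "(95% confidence)")
```

Note how much wider this interval is than a confidence interval for the mean from the same seven measurements: it must cover individual values, and it must allow for the uncertainty in both x̄ and s.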

Table B.7 Short Table of Random Numbers


77 

27 

86 

26 

21 

89 

91 

71 

42 

64 

64 

58 

74 

44 

19 

!l5 

55 

33 

90 

44 

27 

22 

07 

62 

17 

34 

39 

80 

62 

24 

33 

81 

67 

28 

11 

34 

79 

26 

35 

34 

23 

09 

94 

00 

80 

55 

31 

63 

27 

91 

74 

97 

80 

30 

65 

07 

71 

30 

01 

84 

47 

45 

89 

70 

74 

13 

04 

90 

51 

27 

61 

34 

63 

87 

44 

2" 

14 

61 

60 

86 

38 

33 

71 

13 

33 

72 

08 

16 

13 

50 

56 

48 

51 

29 

48 

30 

93 

45 

66 

29 

40 

03 

96 

40 

03 

47 

24 

60 

09 

21 

21 

18 

00 

05 

86 

52 

85 

40 

73 

73 

57 

68 

52 

33 

44 

78 

98 

62 

42 

05 

32 

55 

02 

37 

59 

20 

40 

93 

17 

82 

24 

19 

90 

80 

87 

32 

74 

59 

84 

24 

49 

79 

17 

23 

75 

83 

42 

00 

11 

02 

55 

57 

48 

84 

74 

36 

22 

67 

19 

20 

15 

92 

53 

37 

13 

75 

54 

89 

56 

73 

23 

39 

07 

10 

33 

79 

26 

34 

54 

71 

33 

89 

74 

68 

48 

23 

17 

49 

18 

81 

05 

52 

85 

70 

05 

73 

11 

17 

67 

28 

25 

47 

89 

, . 

65 

65 

42 

23 

96 

64 

20 

30 

89 

87 

64 

37 

91 

50 

7. 

">0 

18 

54 

34 

68 

02 

87 

23 

05 

43 

93 

08 

30 

92 

98 

24 

43 

23 

72 

80 

64 

34 

27 

23 

46 

15 

36 

10 

63 

21 

59 

69 

76 

02 

62 

31 

62 

47 

60 

34 

39 

91 

63 

18 

38 

27 

10 

78 

88 

84 

42 

32 

00 

97 

92 

00 

04 

94 

50 

05 

75 

82 

70 

80 

35 

74 

62 

19 

67 

54 

18 

28 

92 

33 

69 

98 

96 

74 

35 

72 

11 

68 

25 

08 

95 

31 

79 

79 

54 

oi 

AO 

fin 

42 

57 

66 

76 

72 

91 

03 

63 

48 

46 

44 

01 

33 

53 

62 

•>8 

80 

59 

55 

05 

02 

16 

13 

17 

54 

06 

36 

63 

06 

15 

03 

72 

38 

01 

58 

25 

37 

66 

48 

56 

19 

56 

41 

29 

28 

76 

49 

74 

39 

50 

92 
25

13

Excerpted from "Experimental Statistics" (19) Table A.36 which may be consulted for more
extensive listings.

Table B.8 Z-Factors for Two-Sided Confidence Interval


Confidence Level, %     Z-Factor

50                      0.67
67                      1.00
75                      1.15
90                      1.645
95                      1.960
95.45                   2.000
99.00                   2.575
99.74                   3
99.9934                 4
99.99995                5
100 - 10^-9             6
100 - 10^-12            7
100 - 10^-15            8
100 - 10^-18.9          9
100 - 10^-23            10
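
The Z-factor for a two-sided interval is the standard normal quantile at 0.5 + (confidence level)/2. As a minimal sketch, the entries in the middle of the table can be checked with Python's standard library; the function name `z_factor` is hypothetical.

```python
from statistics import NormalDist


def z_factor(confidence_percent):
    """Z-factor for a two-sided interval at the given confidence level (%)."""
    p = confidence_percent / 100.0
    # Half of the excluded probability lies in each tail.
    return NormalDist().inv_cdf(0.5 + p / 2.0)


for level in (50, 90, 95, 95.45):
    print(f"{level:>6}%  Z = {z_factor(level):.3f}")
```

The extreme entries at the bottom of the table (Z of 6 and above) lie too close to 100% to be evaluated reliably this way in ordinary floating-point arithmetic.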




APPENDIX  C.      STATISTICAL  TOOLS 


C.1  Introduction 

The  following  pages  contain  a  brief  description,  with  examples,  of  statistical 
calculations  related  to  some  of  the  questions  that  arise  when  evaluating  chemical  measurement 
data.  The  reader  is  referred  to  the  many  excellent  books  that  are  available  which  discuss 
these  matters  in  more  detail  and  the  basis  for  the  relationships  used.  General  information  on 
precision  measurement  is  contained  in  reference  [16],  and  NBS  Handbook  91  [19]  is  especially 
recommended  for  a  detailed  discussion  of  statistical  concepts.  It  contains  many  numerical 
examples as well as extensive tables, from which most of the ones included in Appendix B were
taken.

The results of repetitive measurements are usually considered to be normally distributed
and representable by a bell-shaped curve. If a series of n measurements were made many times,
one would obtain data sets represented by distributions such as those in Figure C.1. The
means of each set will differ from each other. If the sample standard deviation, s, is
calculated for each set of such measurements, a different result would be expected each time.
Figure C.2 shows the two-sided (asymmetric) confidence limits for σ based on such estimates,
for several probability levels.


Figure C.1 Expected distribution of the means of random samples/measurements. The
individual measurements are indicated by x and the means, x̄, by dots. (Horizontal
axis: Series Number.)

When several series of measurements are made, both the means and the standard deviations
will vary from measurement to measurement, as illustrated in Figure C.3. As n increases from
4 to 1000, the variation of the means decreases but never disappears.
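
The shrinking but never vanishing scatter of the means can be reproduced with a small simulation. The Python sketch below is illustrative only: it assumes a normal population with mean 0 and standard deviation 1, a fixed random seed, and 200 simulated series per sample size, none of which come from this handbook, and the function name `spread_of_means` is hypothetical.

```python
import random
import statistics


def spread_of_means(n, series=200, mu=0.0, sigma=1.0, seed=1):
    """Standard deviation of the means of `series` simulated sets of n values."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(series)
    ]
    return statistics.stdev(means)


for n in (4, 100, 1000):
    # The scatter of the means is expected to be near sigma / sqrt(n).
    print(f"n = {n:>4}: observed scatter {spread_of_means(n):.3f}, "
          f"expected about {1.0 / n ** 0.5:.3f}")
```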

On  considering  the  above,  it  should  be  obvious  that  even  the  best  of  measurements  will 
differ  amongst  themselves,  whether  made  by  the  same  or  different  laboratories  or  scientists. 
One  often  needs  to  answer  questions  such  as  the  confidence  that  can  be  placed  in  measurement 
data  and  the  significance  of  apparent  differences  resulting  from  measurements.  The  various 
equations given in this appendix take into account both the expected variability within
populations and the uncertainties in the estimates of the population parameters that must be
considered  when  answering  such  questions. 

C.2 Estimation of Standard Deviation


The basic parameters which characterize a population (universe) of samples or measurements
on a given sample are the mean, μ, and the standard deviation, σ. Unless the entire
population is examined, μ and σ cannot be known but are estimated from sample(s) randomly
selected (assumed) from it. The result is a sample mean, x̄, and an estimate of the standard
deviation, s, which must be used if such things as confidence intervals, population
characteristics, tolerance intervals, comparison of precision, and the significance of
apparent discrepancies in measured values are to be evaluated.


Figure C.2 Confidence Interval for σ (horizontal axis: Number of Measurements)

This figure illustrates the expected variability of estimates of standard deviations
made on various occasions, as a function of the number of measurements involved. The
factor, when multiplied by the estimate of the standard deviation, gives the interval
that is expected to include the population standard deviation for a given percentage of
occasions. The labels, e.g., 10%, indicate the percentage of time that such an interval
would not be expected to include σ. See section C.4 for a discussion and Table B.4 for
the factors to be used in such calculations.


Figure C.3 Computed 50% confidence intervals for the population mean, m, from 100
samples of 4, 40 samples of 100, and 4 samples of 1000 [19].

The vertical lines essentially are error bars. The sample means, located at the center
of each, are not indicated. The sample means and standard deviations (proportional to
the error bars) vary with each set of measurements. The error bars decrease inversely
as the square root of the sample size is increased and the means show correspondingly
smaller deviations from the population mean.

Several  ways  by  which  the  standard  deviation  may  be  estimated  are  given  in  the  following 
sections. 


C.2.1 Estimation of Standard Deviation from Replicate Measurements

For a series of n measurements

$$\bar{x} = \frac{\sum x_i}{n}, \qquad s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$


s is estimated with ν = n-1 degrees of freedom.


Example: C.2.1 - Series of Measurements


  x_i      (x_i - x̄)    (x_i - x̄)²

  15.2       .143         .0204
  14.7      -.357         .1275
  15.1       .043         .0018
  15.0      -.057         .0033
  15.3       .243         .0590
  15.2       .143         .0204
  14.9      -.157         .0247

x̄ = 15.057              Σ = .2572

n = 7

$$s = \sqrt{\frac{0.2572}{6}} = 0.207$$
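
The calculation in Example C.2.1 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `replicate_sd` is hypothetical.

```python
import math


def replicate_sd(values):
    """Return (mean, s, degrees of freedom) for a series of replicates."""
    n = len(values)
    mean = sum(values) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in values)
    s = math.sqrt(sum_sq_dev / (n - 1))
    return mean, s, n - 1


# The seven measurements of Example C.2.1.
data = [15.2, 14.7, 15.1, 15.0, 15.3, 15.2, 14.9]
mean, s, dof = replicate_sd(data)
print(f"mean = {mean:.3f}, s = {s:.3f}, degrees of freedom = {dof}")
```

Carrying full precision rather than rounded deviations changes the sum of squares only in the fourth decimal place and leaves s unchanged to three figures.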


C.2.2 Estimation of Standard Deviation from Duplicate Measurements

$$s = \sqrt{\frac{\sum d^2}{2k}}$$

where         k     =     number of sets of duplicate measurements

d     =     difference of a duplicate measurement

ν     =     k degrees of freedom

Note:  It  is  not  necessary  that  the  duplicate  measurements  be  made  on  the  same  materials.  It 
is  only  necessary  that  the  materials  measured  are  expected  to  have  the  same  standard 
deviation  of  measurement. 


Example A: C.2.2 - Estimation from Duplicates, Same Material


  x_f      x_s      |d|      d²

  14.7     15.0     0.3      0.09
  15.1     14.9     0.2      0.04
  15.0     15.1     0.1      0.01
  14.9     14.9     0.0      0.0
  15.3     14.8     0.5      0.25
  14.9     15.1     0.2      0.04
  14.9     15.0     0.1      0.01

                         Σ   0.44

$$s = \sqrt{\frac{0.44}{14}} = 0.18$$

ν = 7 degrees of freedom


Example B: C.2.2 - Estimation from Duplicates, Different Materials


  x_f      x_s      |d|      d²

  14.7     15.0     0.3      0.09
  20.1     19.8     0.3      0.09
  12.5     13.0     0.5      0.25
  23.6     23.3     0.3      0.09
  15.1     14.9     0.2      0.04
  18.2     18.0     0.2      0.04
  20.7     20.9     0.2      0.04

                         Σ   0.64

$$s = \sqrt{\frac{0.64}{14}} = 0.21$$


ν = 7 degrees of freedom
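
The duplicate-measurement estimate can be checked in the same way. The Python sketch below applies s = sqrt(Σd²/2k) to the pairs of Examples A and B; the function name `duplicate_sd` is hypothetical.

```python
import math


def duplicate_sd(pairs):
    """Return (s, degrees of freedom) from k pairs of duplicate results."""
    k = len(pairs)
    sum_d_squared = sum((first - second) ** 2 for first, second in pairs)
    return math.sqrt(sum_d_squared / (2 * k)), k


# Example A: duplicates on the same material.
same_material = [(14.7, 15.0), (15.1, 14.9), (15.0, 15.1), (14.9, 14.9),
                 (15.3, 14.8), (14.9, 15.1), (14.9, 15.0)]
# Example B: duplicates on different materials.
different_materials = [(14.7, 15.0), (20.1, 19.8), (12.5, 13.0), (23.6, 23.3),
                       (15.1, 14.9), (18.2, 18.0), (20.7, 20.9)]

s_a, dof_a = duplicate_sd(same_material)
s_b, dof_b = duplicate_sd(different_materials)
print(f"Example A: s = {s_a:.2f} with {dof_a} degrees of freedom")
print(f"Example B: s = {s_b:.2f} with {dof_b} degrees of freedom")
```

Only the differences within each pair enter the calculation, which is why the levels of the materials in Example B may differ without affecting the estimate.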

C.2.3 Estimation of Standard Deviation from the Range

The range, R, of a series of measurements is defined as the difference of the highest and
lowest value obtained. The average range, R̄, based on several sets (k) of measurements is
calculated.

$$\bar{R} = \frac{R_1 + R_2 + R_3 + \cdots + R_k}{k}$$

$$s = \bar{R} / d_2^*$$

The value for d2* is obtained from Table B.2 and will depend on the number of sets, k, used
to calculate R̄ and the number of measurements in a set. The table shows also the number of
degrees of freedom for the estimate of the standard deviation. The materials used to estimate
R̄ may be different, with the same restrictions as noted in C.2.2.


Example C.2.3 - From the Range of Duplicate Measurements

First Result     Second Result     Range

14.5             14.2              0.3
14.8             14.9              0.1
14.05            14.3              0.25
14.2             14.8              0.6
14.9             14.9              0.0
14.3             14.4              0.1
14.7             14.1              0.6
14.4             14.7              0.3
14.1             14.25             0.15
14.95            14.65             0.3
14.25            14.95             0.7

$$\bar{R} = \frac{0.3 + 0.1 + 0.25 + 0.6 + 0 + 0.1 + 0.6 + 0.3 + 0.15 + 0.3 + 0.7}{11} = 0.309$$

$s = \bar{R}/d_2^*$

$d_2^*$ = 1.16 for 11 estimates of R (see Table B.2, by interpolation)

s = 0.27

v = 10 degrees of freedom for s (from Table B.2, rounded)
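
The range calculation can be sketched in code as follows. This is an illustrative sketch only; the function name `sd_from_range` is hypothetical, and the divisor (1.16 here) is not computed but must be read from Table B.2 and supplied by the user.

```python
def sd_from_range(sets, d2_star):
    """Estimate s from the average range of several sets of measurements.

    sets    : list of tuples of replicate results (duplicates, triplicates, ...)
    d2_star : divisor taken from Table B.2 for the given set size and number of sets
    Returns (average range, s).
    """
    ranges = [max(results) - min(results) for results in sets]
    r_bar = sum(ranges) / len(ranges)
    return r_bar, r_bar / d2_star


# Duplicate results of Example C.2.3
duplicates = [(14.5, 14.2), (14.8, 14.9), (14.05, 14.3), (14.2, 14.8),
              (14.9, 14.9), (14.3, 14.4), (14.7, 14.1), (14.4, 14.7),
              (14.1, 14.25), (14.95, 14.65), (14.25, 14.95)]

r_bar, s = sd_from_range(duplicates, d2_star=1.16)
print(f"average range = {r_bar:.3f}, s = {s:.2f}")
```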
C.2.4  Pooling Estimates of Standard Deviations

Several estimates of the standard deviation may be pooled to obtain a better estimate.
Given several estimates of the standard deviation obtained on several occasions, with the
corresponding numbers of measurements:

$s_1$     $n_1$     $v_1 = n_1 - 1$
$s_2$     $n_2$     $v_2 = n_2 - 1$
$s_3$     $n_3$     $v_3 = n_3 - 1$
...
$s_k$     $n_k$     $v_k = n_k - 1$

$$s_{pooled} = \sqrt{\frac{v_1 s_1^2 + v_2 s_2^2 + \cdots + v_k s_k^2}{v_1 + v_2 + \cdots + v_k}}$$

$s_{pooled}$ will be based on $(v_1 + v_2 + \cdots + v_k)$ degrees of freedom.
Note:  Ordinarily, v = n-1

Example C.2.4 - Pooling Standard Deviations

The standard deviation of a measurement process was estimated on five occasions. These
are to be pooled to improve the estimate of σ.

Trial     s         n     v

1         0.171     7     6
2         0.205     5     4
3         0.185     7     6
4         0.222     4     3
5         0.180     5     4

$$s_p^2 = \frac{6(0.171)^2 + 4(0.205)^2 + 6(0.185)^2 + 3(0.222)^2 + 4(0.180)^2}{6 + 4 + 6 + 3 + 4}$$

$$s_p^2 = \frac{0.1755 + 0.1681 + 0.2054 + 0.1479 + 0.1296}{23}$$

$s_p$ = 0.190 with 23 degrees of freedom.
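
The pooling formula lends itself to a short function. The sketch below is illustrative; the name `pooled_sd` is hypothetical, and each estimate is supplied together with its own degrees of freedom (ordinarily n-1).

```python
import math


def pooled_sd(estimates):
    """Pool standard deviation estimates given as (s, v) pairs.

    Each variance s^2 is weighted by its degrees of freedom v.
    Returns (pooled s, total degrees of freedom).
    """
    total_v = sum(v for _, v in estimates)
    pooled_variance = sum(v * s ** 2 for s, v in estimates) / total_v
    return math.sqrt(pooled_variance), total_v


# The five occasions of Example C.2.4 as (s, v)
occasions = [(0.171, 6), (0.205, 4), (0.185, 6), (0.222, 3), (0.180, 4)]

s_p, v_p = pooled_sd(occasions)
print(f"s_p = {s_p:.3f} with {v_p} degrees of freedom")
```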

C.3  Do Two Estimates of Precision Differ?

Conduct an F test, as follows:

Let $s_1$ = estimate of standard deviation (larger value) based on $n_1$ measurements.

Let $s_2$ = estimate of standard deviation (smaller value) based on $n_2$ measurements.

In each case, the respective degrees of freedom, v = n-1.

Calculate $F = s_1^2 / s_2^2$

Look up the critical value, $F_c$, in Table B.3, based on the respective degrees of freedom for
the estimates of $s_1$ and $s_2$.

If $F > F_c$, consider $s_1 > s_2$ at the chosen level of confidence.

If $F < F_c$, there is no reason to believe that $s_1 > s_2$ at the chosen level
of confidence.

Example C.3 - Comparison of Precision Estimates

$s_1$ = 2.00     $n_1$ = 6     $v_1$ = 5

$s_2$ = 1.00     $n_2$ = 6     $v_2$ = 5

F = 4.00/1.00 = 4.0
$F_c$ = 7.15 at 5% level of significance
Conclusion:  There is no reason to believe that $s_1 > s_2$.
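
The F test reduces to one ratio and one comparison. The sketch below is illustrative only; the function name `f_test` is hypothetical, and the critical value is not computed but must be looked up in Table B.3 and passed in.

```python
def f_test(s_a, s_b, f_critical):
    """Compare two standard deviation estimates with an F test.

    The larger estimate is always placed in the numerator, so F >= 1.
    f_critical is the tabulated critical value for the respective
    degrees of freedom (supplied by the user from Table B.3).
    Returns (F, True if the larger s is judged significantly larger).
    """
    s1, s2 = max(s_a, s_b), min(s_a, s_b)
    f = (s1 / s2) ** 2
    return f, f > f_critical


# Values of Example C.3
f, differs = f_test(2.00, 1.00, f_critical=7.15)
print(f"F = {f:.1f}, precision estimates differ: {differs}")
```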
C.4  What are the Confidence Limits for an Estimate of a Standard Deviation?

The width of the confidence interval for an estimated standard deviation will depend on
the number of degrees of freedom, v, upon which the estimate is based (v = n-1). Unlike the
interval for a mean, the interval is not symmetrical (see Figure C.2), since a small number of
measurements tends to underestimate the standard deviation. To calculate the bounds of the
interval, one may use a table such as Table B.4 and find the factors $B_U$ and $B_L$ corresponding
to the number of degrees of freedom involved and the confidence level sought. In the table,
α = 0.05 corresponds to a confidence of 95% for the interval so calculated. The confidence
interval is then $sB_L$ to $sB_U$.

Example C.4 - Confidence Limits for Estimate of Standard Deviation
s = 0.15     n = 10     v = 9

For α = 0.05, v = 9 (see Table B.4) one finds $B_U$ = 1.746; $B_L$ = 0.6657
The confidence interval for s is 0.15 x 0.6657 to 0.15 x 1.746 or 0.10 to 0.26.


C.5     Confidence   Interval   for   a  Mean 

The confidence interval for the mean will depend on the number of measurements, n, the
standard deviation, s, and the level of confidence desired. The confidence interval is
calculated using the expression

$$\bar{x} \pm \frac{ts}{\sqrt{n}}$$

The value for t (see Table B.5) will depend on the level of confidence desired and the number
of degrees of freedom, v, associated with the estimation of s. If s is based on the set of
measurements used to calculate the mean, $\bar{x}$, then v = n-1. If the measurements are made by a
system under statistical control, as demonstrated by a control chart, v will depend on the
number of measurements made to establish the control limits.

Example C.5 - Confidence Interval Based on s Estimated from Data Set of Seven Measurements

$\bar{x}$ = 10.05
s = 0.11
n = 7
v = 6

For a 95 percent level of confidence, t = 2.447, hence

$$10.05 \pm \frac{2.447 \times 0.11}{\sqrt{7}} = 10.05 \pm 0.10, \text{ or } 9.95 \text{ to } 10.15$$


Example C.5 - Confidence Interval Based on s Obtained from Control Chart Limits,
One Measurement of x

x = 10.05
s = 0.11
v = 45 (control chart)
n = 1

For a 95 percent level of confidence, t = 2.016, hence

$$10.05 \pm \frac{2.016 \times 0.11}{\sqrt{1}} = 10.05 \pm 0.22, \text{ or } 9.83 \text{ to } 10.27$$

Values of t in the above were obtained from Table B.5, by interpolation.


Note:  There  is  no  statistical  basis  for  a  confidence  level  statement  for  one  measurement 
unless   supported   by   a   control   chart   or   other   evidence   of   statistical  control. 
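
The interval calculation can be sketched as a small function. This is a hedged illustration, not part of the handbook; the name `mean_confidence_interval` is hypothetical, and t is supplied by the user from Table B.5 rather than computed.

```python
import math


def mean_confidence_interval(mean, s, n, t):
    """Return (lower, upper) confidence limits: mean +/- t*s/sqrt(n).

    t must correspond to the desired confidence level and to the degrees
    of freedom associated with s (not necessarily n-1, e.g. when s comes
    from a control chart).
    """
    half_width = t * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# s estimated from the seven measurements themselves (v = 6)
low_7, high_7 = mean_confidence_interval(10.05, 0.11, n=7, t=2.447)

# s taken from control chart limits (v = 45), single measurement
low_1, high_1 = mean_confidence_interval(10.05, 0.11, n=1, t=2.016)

print(f"n = 7: {low_7:.2f} to {high_7:.2f}")
print(f"n = 1: {low_1:.2f} to {high_1:.2f}")
```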


C.6  Do the Means of Two Measured Values Disagree Significantly?

The decision on disagreement is based on whether the difference, Δ, of the two values
exceeds its statistical uncertainty, U. The method used for calculation of the uncertainty
depends on whether or not the respective standard deviation estimates may be considered to be
significantly different.

Case I   No reason to believe that the standard deviations differ (e.g., same method,
analyst, experimental conditions, etc.).

Step 1   Choose the significance level of the test.

Step 2   Calculate a pooled standard deviation from the two estimates to obtain a better
estimate of the standard deviation.

$$s_p = \sqrt{\frac{v_A s_A^2 + v_B s_B^2}{v_A + v_B}}$$

$s_p$ will be based on $v_A + v_B$ degrees of freedom.

Step 3   Calculate the uncertainty, U, of the difference

$$U = t\, s_p \sqrt{\frac{n_A + n_B}{n_A n_B}}$$

Step 4   Compare $\Delta = |\bar{x}_A - \bar{x}_B|$ with U

If Δ < U, there is no reason to believe that the means disagree.

Example

$\bar{x}_A$ = 4.25     $\bar{x}_B$ = 4.39

$s_A$ = 0.13     $s_B$ = 0.17

$n_A$ = 7     $n_B$ = 10

$v_A$ = 6     $v_B$ = 9

$\Delta = |\bar{x}_A - \bar{x}_B| = |4.25 - 4.39| = 0.14$

Step 1   α = 0.05 (95% confidence)

Step 2   $s_p = \sqrt{\frac{6(0.13)^2 + 9(0.17)^2}{15}} = \sqrt{\frac{0.1014 + 0.2601}{15}} = 0.155$

Step 3   $U = 2.131 \times 0.155 \times \sqrt{17/70}$

         U = 0.163

Step 4   0.14 < 0.163

Conclude that there is no reason to believe that 4.39 differs from 4.25 at the 95% level of
confidence.

Case II   Reason to believe that the standard deviations differ (e.g., different
experimental conditions, different laboratories, etc.)

Step 1   Choose α, the significance level of the test.

Step 2   Compute the estimated variance of each value

$$V_A = \frac{s_A^2}{n_A} \qquad V_B = \frac{s_B^2}{n_B}$$

Step 3   Compute the effective number of degrees of freedom, f

$$f = \frac{(V_A + V_B)^2}{\dfrac{V_A^2}{n_A + 1} + \dfrac{V_B^2}{n_B + 1}} - 2$$

Step 4   Compute the uncertainty, U, of the difference

$$U = t \sqrt{V_A + V_B}$$

Step 5   Compare Δ with U

If Δ < U, there is no reason to believe that the means disagree.

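Example

$\bar{x}_A$ = 4.25     $\bar{x}_B$ = 4.39

$s_A$ = 0.13     $s_B$ = 0.17

$n_A$ = 7     $n_B$ = 10

Step 1   α = 0.05 (95% confidence)

Step 2   $V_A = (0.13)^2/7 = 2.414 \times 10^{-3}$     $V_B = (0.17)^2/10 = 2.89 \times 10^{-3}$

Step 3   $f = \frac{(2.414 \times 10^{-3} + 2.89 \times 10^{-3})^2}{\dfrac{(2.414 \times 10^{-3})^2}{8} + \dfrac{(2.89 \times 10^{-3})^2}{11}} - 2 = 17$

Step 4   $U = 2.11 \sqrt{2.414 \times 10^{-3} + 2.89 \times 10^{-3}}$

         U = 0.153

Step 5   Δ = 0.14;  U = 0.153

Conclude there is no reason to believe that $\bar{x}_A$ differs from $\bar{x}_B$ at 95% level of confidence.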
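
Both cases can be evaluated numerically with the sketch below. It is illustrative only: the function names are hypothetical, the t values (2.131 for 15 degrees of freedom, 2.11 for 17) are taken from a t table rather than computed, and the formulas are the pooled form $U = t s_p \sqrt{(n_A + n_B)/(n_A n_B)}$ for Case I and $U = t\sqrt{V_A + V_B}$ for Case II.

```python
import math


def uncertainty_pooled(s_a, n_a, s_b, n_b, t):
    """Case I: pool the two estimates, then U = t * s_p * sqrt((nA+nB)/(nA*nB))."""
    v_a, v_b = n_a - 1, n_b - 1
    s_p = math.sqrt((v_a * s_a ** 2 + v_b * s_b ** 2) / (v_a + v_b))
    return t * s_p * math.sqrt((n_a + n_b) / (n_a * n_b))


def effective_dof(s_a, n_a, s_b, n_b):
    """Case II: effective degrees of freedom, using the n+1 and -2 form."""
    var_a, var_b = s_a ** 2 / n_a, s_b ** 2 / n_b
    return (var_a + var_b) ** 2 / (var_a ** 2 / (n_a + 1) + var_b ** 2 / (n_b + 1)) - 2


def uncertainty_unpooled(s_a, n_a, s_b, n_b, t):
    """Case II: U = t * sqrt(V_A + V_B), with V = s^2 / n for each mean."""
    return t * math.sqrt(s_a ** 2 / n_a + s_b ** 2 / n_b)


delta = abs(4.25 - 4.39)
u_pooled = uncertainty_pooled(0.13, 7, 0.17, 10, t=2.131)
f = effective_dof(0.13, 7, 0.17, 10)
u_unpooled = uncertainty_unpooled(0.13, 7, 0.17, 10, t=2.11)

print(f"delta = {delta:.2f}")
print(f"Case I : U = {u_pooled:.3f}, means disagree: {delta > u_pooled}")
print(f"Case II: f = {f:.1f}, U = {u_unpooled:.3f}, means disagree: {delta > u_unpooled}")
```

In practice f is rounded to the nearest integer before the t table is entered.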


C.7     Statistical   Tolerance  Intervals 

A  tolerance  interval  represents  the  limits  within  which  a  specified  percentage  of  the 
population  is  expected  to  lie  with  a  given  probability.  It  is  especially  useful  to  specify 
the  variability  in  composition  of  samples.  If  the  standard  deviation  of  the  population  of 
samples  were  known,  the  limits  for  a  given  percentage  of  the  population  could  be  calculated 
with  certainty.  Because  only  an  estimate  of  the  standard  deviation  is  usually  known,  based  on 
a  limited  sampling  of  the  population,  a  tolerance  interval,  based  on  inclusion  of  a  percentage 
of   the   population  with   a  specific   probability   of   inclusion,    is   all   that   can   be  calculated. 

The   calculation   is  made   as  follows: 

Tolerance   Interval   =   x   ±  ks 

where  k  =  a  factor  (obtained  from  Table  B.6  for  example)  based  on  the  percentage  of 
population  to  be  included,  the  probability  of  inclusion,  and  the  number  of 
measurements   used   to   calculate   x  and  s. 


Example   C.7   -  Statistical   Tolerance  Interval 

For  measurements   of   ten   samples   of   a  shipment   of   coal,    the  sulfur   content  was   found   to  be 

$\bar{x}$ = 1.62%     s = 0.10%     n = 10

From Table B.6, k = 3.379 for γ = 0.95, p = 0.95, n = 10.
The tolerance interval is thus 1.62% ± 0.34% or 1.28% to 1.96%.
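C.8  Pooling Means to Obtain a Grand Average, $\bar{\bar{x}}$

Case I   All means based on same number of measurements of equal precision

$$\bar{\bar{x}} = \frac{\bar{x}_1 + \bar{x}_2 + \bar{x}_3 + \cdots + \bar{x}_k}{k}$$

where k is the number of means.

Case II   Means based on different number of measurements, but no reason to believe the
precisions differ

$$\bar{\bar{x}} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k}$$

Case III   Means based on different number of measurements with differing precisions

Step 1   Compute weight to be used for each mean,

$$w_i = \frac{n_i}{s_i^2}, \quad \text{e.g., } w_1 = \frac{n_1}{s_1^2}$$

Step 2   Compute the weighted grand average

$$\bar{\bar{x}} = \frac{w_1 \bar{x}_1 + w_2 \bar{x}_2 + \cdots + w_k \bar{x}_k}{w_1 + w_2 + \cdots + w_k}$$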

Example C.8 - Calculation of Grand Average, Case I

To calculate the grand average, $\bar{\bar{x}}$, of the following means, all believed to be equally
precise:

$\bar{x}_1$ = 10.50
$\bar{x}_2$ = 10.37
$\bar{x}_3$ = 10.49
$\bar{x}_4$ = 10.45
$\bar{x}_5$ = 10.47

$$\bar{\bar{x}} = \frac{10.50 + 10.37 + 10.49 + 10.45 + 10.47}{5} = 10.456$$


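Example C.8 - Calculation of Grand Average, Case II

$\bar{x}_1$ = 10.50     n = 10
$\bar{x}_2$ = 10.37     n = 5
$\bar{x}_3$ = 10.49     n = 20
$\bar{x}_4$ = 10.45     n = 5
$\bar{x}_5$ = 10.47     n = 7

$$\bar{\bar{x}} = \frac{10.50 \times 10 + 10.37 \times 5 + 10.49 \times 20 + 10.45 \times 5 + 10.47 \times 7}{10 + 5 + 20 + 5 + 7} = 10.472$$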


Example C.8 - Calculation of Grand Average, Case III

i     x_i       n_i     s_i      w_i

1     10.50     10      0.10     1000
2     10.37      5      0.15      222
3     10.49     20      0.11     1652
4     10.45      5      0.10      500
5     10.47      7      0.16      273

$$\bar{\bar{x}} = \frac{10.50 \times 1000 + 10.37 \times 222 + 10.49 \times 1652 + 10.45 \times 500 + 10.47 \times 273}{1000 + 222 + 1652 + 500 + 273}$$

$\bar{\bar{x}}$ = 10.478

$s_{\bar{\bar{x}}}^2 = 1/\sum w_i = 2.74 \times 10^{-4}$
$s_{\bar{\bar{x}}}$ = 0.0166
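
The three cases differ only in the weights applied to the means, which the following sketch makes explicit. It is an illustration under stated assumptions: the function name `grand_average` is hypothetical, and the standard deviation of the weighted grand average is taken as $1/\sqrt{\sum w_i}$, which reproduces the value 0.0166 of the Case III example.

```python
import math


def grand_average(means, weights=None):
    """Weighted average of means; equal weights if none are given."""
    if weights is None:
        weights = [1.0] * len(means)
    return sum(w * m for w, m in zip(weights, means)) / sum(weights)


means = [10.50, 10.37, 10.49, 10.45, 10.47]
n = [10, 5, 20, 5, 7]
s = [0.10, 0.15, 0.11, 0.10, 0.16]

case_1 = grand_average(means)        # Case I: equal weights
case_2 = grand_average(means, n)     # Case II: weights n_i

# Case III: weights n_i / s_i^2 (not rounded, unlike the tabulated w_i)
w = [n_i / s_i ** 2 for n_i, s_i in zip(n, s)]
case_3 = grand_average(means, w)
s_grand = 1.0 / math.sqrt(sum(w))    # assumed form, see lead-in

print(f"Case I   : {case_1:.3f}")
print(f"Case II  : {case_2:.3f}")
print(f"Case III : {case_3:.3f}, s = {s_grand:.4f}")
```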

C.9  Outliers 

Outliers are data values that do not belong or have a very low probability of belonging to
the data set in which they occur. They can result from such causes as blunders or
malfunctions of the methodology, or from unusual losses or contamination. If outliers occur too
often, there may be deficiencies in the quality control program which can be corrected and
thus improve the measurements. One should always look for causes when data are rejected.

Outliers  can  be  identified  when  data  are  plotted,  when  results  are  ranked,  and  when 
control  limits  are  exceeded.  Only  when  a  measurement  system  is  well  understood  and  the 
variance  is  well  established,  or  when  a  large  body  of  data  are  available,  is  it  possible  to 
distinguish   between  extreme   values   and   true  outliers   with  any   degree   of  confidence. 

The  following  rules  for  rejection  of  data  should  be  used  with  caution  since  an  outlier  in 
a  well-behaved  measurement   system   should   be   a  rare  occurrence. 

A.  Rejection for Assignable Cause

System malfunction, misidentification of a sample, suspected transcription error, and
known contamination are examples.

B.  Rule of Huge Error

If the questioned value differs from the mean by some multiple, M, of the standard
deviation, it may be considered to be an outlier. The size of the multiple depends on
the confidence required for rejection. One evaluates

$$M = \frac{|\text{questioned value} - \bar{x}|}{s}$$


A  practical  rule  might  be  to  use  M  >  4  as  a  criterion  for  rejection.  This  corresponds 
to  a  significance  level  of  <  2%  when  the  standard  deviation  is  well  established,  such 
as   based  on   a   data   set   of   15  or  larger. 

If  s  is  not  well  established  but  depends  on  the  data  set  in  question,  the  odds  for 
rejection  are  much  larger.  For  example,  if  x  and  s  are  based  on  6  measurements,  M  >  6 
would   be   the   criterion   for   rejection   for   a   2%   level   of  significance. 

C.      Statistical  Tests 


Several statistical tests are available for identifying outliers based on ranking data
and testing extreme values for credibility. The Dixon criterion is described on page
17-3 of NBS Handbook 91 (19) and the critical values for decision on rejection using
this criterion are given in Table A-14 of the same reference.

A  method  for  identifying  outlier  laboratories  in  a  collaborative  test  or  proficiency 
testing  program  is  described  by  Youden  on  page  118  of  NBS  Special  Publication  300  (16) 
where  a  table  of  the  test  score  values  necessary  to  use  his  criterion  also  will  be 
found. 
Example C.9 - Outliers, Huge Error Concept

x (original order)     x (ranked order)

10.50                  10.45
10.47                  10.47
10.49                  10.47
10.45                  10.48
10.47                  10.49
10.57                  10.50
10.52                  10.50
10.50                  10.52
10.48                  10.53
10.53                  10.57

1.  10.57 appears to be an outlier

2.  Calculate mean and sample standard deviation, s, ignoring 10.57

    $\bar{x}$ = 10.490     s = 0.0255

3.  $M = \frac{10.57 - 10.49}{0.0255} = 3.14$

4.  Since 3.14 < 4, conclude that 10.57 should be retained in the data

5.  Calculate mean and sample standard deviation including 10.57

    $\bar{x}$ = 10.498     s = 0.0349
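
The huge error test of this example can be reproduced with the standard library. The sketch below is illustrative; the function name `huge_error_multiple` is hypothetical, and the mean and standard deviation are computed with the questioned value left out, as in step 2.

```python
import statistics


def huge_error_multiple(data, suspect):
    """Return M = |suspect - mean| / s, with mean and s computed without the suspect."""
    rest = list(data)
    rest.remove(suspect)  # drops one occurrence of the questioned value
    return abs(suspect - statistics.mean(rest)) / statistics.stdev(rest)


data = [10.50, 10.47, 10.49, 10.45, 10.47, 10.57, 10.52, 10.50, 10.48, 10.53]

m = huge_error_multiple(data, 10.57)
reject = m > 4  # practical criterion M > 4 from rule B

print(f"M = {m:.2f}, reject: {reject}")
print(f"with 10.57 retained: mean = {statistics.mean(data):.3f}, "
      f"s = {statistics.stdev(data):.4f}")
```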


C.10     Use  of   Random  Number  Tables 

It  is  often  desirable  to  randomize  the  sequence  in  which  measurements  are  made,  samples 
are  chosen,  and  other  variables  of  an  analytical  program  are  set.  Tables  of  random  numbers, 
such  as  Table  B.7,  are  a  convenient  and  simple  way  to  accomplish  this.  The  following 
procedure  may   be  used. 

1.  Number the samples (or measurements) serially, say 00 to xy. For example, 00 to 15
for 16 items.

2.  Start  at  any  randomly  selected  place  in  the  table  and  proceed  from  that  point  in  any 
systematic  path.  The  order  in  which  the  item  numbers  are  located  becomes  the  random 
sequence   number   to   be   assigned   to  them. 

Example:  Start at Row 7, Column 3 (chosen by chance) of Table B.7 and proceed from left to
right as in reading. The first number is 76, which is not usable for the above
series of items. The first usable number is 15. Proceeding as above, the items are
located in the following order: 15, 06, 02, 03, 05, 00, 11, 13, 07, 10, 09, 08, 04,
14, 01, 12. If a number already chosen is encountered, pass over it to the next
usable number.
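
Where a printed table is not at hand, the same procedure can be imitated in software. The sketch below is an assumption-laden substitute, not the handbook's method: it replaces Table B.7 with a stream of two-digit numbers from Python's pseudorandom generator, then applies the same rules (skip unusable numbers, pass over numbers already chosen). The function names and the seed are hypothetical.

```python
import random


def two_digit_numbers(rng):
    """Endless stream of numbers 00-99, standing in for a random number table."""
    while True:
        yield rng.randrange(100)


def random_sequence(n_items, numbers):
    """Order items 0..n_items-1 (n_items <= 100) as they turn up in the stream."""
    order = []
    for number in numbers:
        if number >= n_items or number in order:
            continue  # unusable, or already chosen: pass over it
        order.append(number)
        if len(order) == n_items:
            break
    return order


# 16 items numbered 00 to 15; the seed plays the role of the starting place
order = random_sequence(16, two_digit_numbers(random.Random(7)))
print(order)
```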
Quality Assurance of


Figure  1.  Measurement  tolerances  and  errors 


The objective of quality assurance programs for analytical measurements is to reduce
measurement errors to tolerable limits and to provide a means of ensuring that the
measurements generated have a high probability of being of acceptable quality. Two
concepts are involved. Quality control is the mechanism established to control errors,
while quality assessment is the mechanism to verify that the system is operating within
acceptable limits. General handbooks that discuss quality assurance in more detail
are given in References 1-3.

Quality  is  a  subjective  term.  What  is 
high  quality  in  one  situation  may  be 
low  or  unacceptable  quality  in  another 
case.  Clearly  the  tolerable  limits  of 
error  must  be  established  for  each. 
Along  with  this  there  must  be  a  clear 
understanding  of  the  measurement 
process  and  its  capability  to  provide 
the  results  desired. 

The tolerance limits for the property to be measured are the first conditions to be
determined. These are based upon the considered judgment of the end user of the data and
represent the best estimate of the limits within which the measured property must be
known, to be useful for its intended purpose. The limits must be realistic and defined on
the basis of cost-benefit considerations. It is better to err on the side of too-narrow
limits. Yet, measurement costs normally increase as tolerances are decreased, so that the
number of measurements possible for a fixed budget may be inadequate when coupled with
material-variability considerations.

Once one has determined the tolerance limits for the measured property, the permissible
tolerances in measurement error may be established. The basis for this is shown in
Figure 1. The tolerance limits for the measured property are indicated by $L_p$.
Uncertainties in the measurement, based on the experience and judgment of the analyst,
are indicated by $C_m$. These include estimates of the bounds for the biases (systematic
errors), B, and the random errors as indicated by s, the estimate of the standard
deviation. Obviously, $C_m$ must be less than $L_p$ if the data are to be useful. The
confidence limits for $\bar{x}$, the mean of n replicate measurements, are:

$$C_m = \pm \left[ B + \frac{ts}{\sqrt{n}} \right]$$

in which t is the so-called Student factor. While the effect of random error is
minimized by replication of measurements, there are practical limitations, and any
measurement process that requires a large number of replicates has a serious disadvantage.

Well-designed and well-implemented quality assurance programs provide the means to
operate a measurement system in a state of statistical control, thereby providing the
basis for establishing reliable confidence limits for the data output.


Until a measurement operation . . . has attained a state of statistical control, it
cannot be regarded in any logical sense as measuring anything at all.

C. E. Eisenhart


The  Analytical  System 

Analytical measurements are made because it is believed that compositional information
is needed for some end use in problem solving. Explicitly or implicitly, a measurement
system such as that depicted in Figure 2 is involved. One must have full understanding
of the measurement system for each specific situation in order to generate quality data.

The conceptualization of the problem, including the data requirements and their
application, constitutes the model. The plan, based on the model, includes details of
sampling, measurement, calibration, and quality assurance. Various constraints such as
time, resources, and the availability of samples may necessitate compromises in the plan.
Adequate planning will require the collaboration of the analyst, the statistician, and
the end user of the data in all but the most routine cases. In complex situations,
planning may be an iterative process in which the actual data output may require
reconsideration of the model and revision of the plan.

Sampling has been discussed in a recent paper (4). Obviously, the sample is one of the
critical elements of the measurement process. Closely related is the measurement
methodology to be used. The method used must be adequate for the intended purpose and it
must be properly utilized. The necessary characteristics of a suitable method include:
adequate sensitivity, selectivity, accuracy, and precision. It is desirable that it also
have the following characteristics: large dynamic measurement range; ease of operation;
multiconstituent applicability; low cost; ruggedness; portability. To judge its
suitability, the following information must be known about it: type of sample; forms
determined; range of applicability; limit of detection; biases; interferences;
calibration requirements; operational skills required; precision; and accuracy.
Obviously all of the above characteristics must match the measurement requirements. In
case of doubt, trial measurements must be made to demonstrate applicability to a given
problem. A cost-benefit analysis may be needed to determine which of several candidate
methods is to be selected. A method, once adopted, must be used in a reliable and
consistent manner, in order to provide reproducible data. This is best accomplished by
following detailed written procedures called Standard Operating Procedures (SOPs) in
quality assurance terminology. Standard methods developed by voluntary standardization
organizations are often good candidates for SOPs, when they are available.

Two kinds of calibrations are required in most cases. Physical calibrations may be
needed for the measurement equipment itself and for ancillary measurements such as time,
temperature, volume, and mass. The measurement apparatus may include built-in or
auxiliary tests such as voltage checks, which may need periodic verification of their
stability if not of their absolute values. But especially, most analytical equipment
requires some kind of chemical calibration, often called standardization, to establish
the analytical function (i.e., the relation of instrument response to chemical
quantification). Obviously, the analyst must thoroughly understand each of the
calibrations required for a particular measurement. This includes a knowledge of the
standards needed and their relation to the measurement process, the frequency of
calibration, the effect on a measurement system due to lack of calibration, and even the
shock to the system resulting from recalibration.


Quality  Control 

Quality control encompasses all of the techniques used to encourage reproducibility of
the output of the measurement system. It consists of the use of a series of protocols
developed in advance and based on an intimate understanding of the measurement process
and the definite requirements of the specific measurement situation. Protocols, i.e.,
procedures that must be rigorously followed, should be established for sampling,
measurement, calibration, and data handling. Some of these, or at least selected
portions, may be applicable to most or all of the measurements of a particular laboratory
and become the basis of a good laboratory practices manual (GLPM).


Figure 2.  Analytical measurement system
Figure 3.  Quality control by inspection

In fact, the GLPM should cover the generalities, if not the specifics, of all
measurement practices of the laboratory. The protocols for a specific measurement
process include the GLPs together with any requirements of the specific situation.

The GLPM and protocols should be developed collaboratively by all of those involved in
the measurements, and this development process may be the most important aspect of their
function. It encourages a keen consideration of the measurement process and creates an
awareness of potential problems that GLPs attempt to avert.

Protocols are of little use unless they are followed rigorously, and the attitudes of
laboratory personnel are certainly key factors in this regard. Analysts must aspire to
produce high quality data and must be their own most severe critics. Notwithstanding,
good quality control systems should include provisions for inspection, both periodically
and aperiodically (unannounced) to ascertain how well they are functioning. Large
laboratories may have a quality control officer or group, independent of the laboratory
management, that oversees the operation of the quality control system.

Quality  Control  by  Inspection 

An informal kind of quality control involves the frequent if not constant inspection of
certain aspects of the measurement system for real or apparent problems (5). The
essential features of such a system are depicted in Figure 3. Based on an intimate
knowledge of the measurement process, samples may be casually inspected for their
adequacy. The rejection and possible replacement of obviously unsuitable ones can
eliminate not only extra work but also erroneous data that might be difficult to identify
later. Difficulties in the actual measurement may often be identified as they occur and
remedial measures, including remeasurement, may be taken either to save data that might
otherwise be lost or at least to provide valid reasons for any rejections. Likewise, data
inspection can identify problems and initiate remedial actions, including new
measurements, while it is still possible to do so.

Control  Charts 

The performance of a measurement system can be demonstrated by the measurement of
homogeneous and stable control samples in a planned repetitive process. The data so
generated may be plotted as a control chart in a manner to indicate whether the
measurement system is in a state of statistical control. Either the result of a single
measurement on the control sample, the difference between duplicate measurements, or both
may be plotted sequentially. The first mode may be an indicator of both precision and
bias, while the second monitors precision only.

To effectively use such a chart, the standard deviation of a single measurement of the
control sample must be known. This may be obtained by a series of measurements of the
control sample, or it may be obtained from the experience of the laboratory on measuring
similar samples. Control limits, i.e., the extreme values believed to be credible, are
computed from the standard deviation. For example, the 2σ limit represents those within
which the values are expected to lie 95% of the time. The 3σ limit represents the 99.7%
confidence level. Departures from the former are warnings of possible trouble, while
exceeding the latter usually means corrective action is needed. In the event that the
standard deviation cannot be estimated with sufficient confidence initially, the control
chart may be drawn using the best estimate, and the limits may be modified on the basis
of increasing measurement experience.
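
The use of 2σ warning limits and 3σ action limits can be sketched as follows. This is a minimal illustration with invented control-sample values; the function name `classify`, the central value of 10.00, and the standard deviation of 0.10 are all hypothetical and are not taken from the text.

```python
def classify(value, center, sigma):
    """Classify a control-sample result against 2-sigma and 3-sigma limits."""
    deviation = abs(value - center)
    if deviation > 3 * sigma:
        return "action"       # beyond the 3-sigma limit: corrective action needed
    if deviation > 2 * sigma:
        return "warning"      # between 2 and 3 sigma: possible trouble
    return "in control"


# Hypothetical control chart: central value 10.00, s of a single measurement 0.10
center, sigma = 10.00, 0.10
results = [10.05, 9.92, 10.25, 10.01, 10.35]  # invented control-sample results

labels = [classify(x, center, sigma) for x in results]
for x, label in zip(results, labels):
    print(f"{x:.2f}  {label}")
```

If the limits are later revised as experience accumulates, only `center` and `sigma` need to change.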

The development of a control chart must include the rationale for its use. There must
be a definite relation between the control measurements and the process they are designed
to control. While the control chart only signifies the degree of replication of
measurements of the control sample, its purpose is to provide confidence in the
measurement process. To do this, the control measurements must simulate the measurements
normally made. In chemical measurements, this means simulation of matrix, simulation of
concentration levels, and simulation of sampling. The latter objective may be difficult
if not impossible to achieve. It must be further emphasized that the control measurements
should be random members of the measurement routine, or at least they should not occupy
biased positions in any measurement sequence.

To the extent that control samples are representative of the test samples, and to the
extent that measurements of them are representative of the measurement process, the
existence of statistical control for these samples can imply such control of the
measurement process and likewise of the results obtained for the test samples.

No specific statements can be made about the frequency of use of control samples. Until
a measurement process is well understood, control samples may need to be measured
frequently. As it is demonstrated to be in control, the need may become less and the
incentive to do "extra" work may diminish. Along with the decision on how much effort
should be devoted to quality control, the risks and consequences of undetected loss of
control must be weighed. Many laboratories consider that the 5-15% extra effort
ordinarily required for all aspects of quality control is a small price to pay for the
quality assurance it provides. When measurements are made on a frequently recurring
schedule, internal controls, such as duplicate measurements of test samples, can provide
evidence of reproducibility so that control samples may be used largely to identify
systematic errors, drifts, or other types of problems.

When  laboratories  are  engaged  in  a 
variety  of  measurements,  the  use  of 
representative  control  samples  may  be 
difficult  if  not  impossible.  In  such 
cases,  often  only  the  measurement 
methodology  can  be  tested,  and  evalu- 
ation of  the  quality  of  the  measure- 
ment output  requires  considerable 
judgment.  In  such  cases,  the  experi- 
ence of  the  lab  becomes  a  key  factor. 

In  some  complex  measurement  sys- 
tems, certain  steps  or  subsystems  are 
more  critical  than  others,  and  hence  it 
may  be  more  important  to  develop 
control  charts  for  them  than  for  the 
entire  system.  The  control  of  such 
steps  may  indeed  prevent  propagation 
of  error  into  the  end  result.  An  exam- 
ple is  the  sampling  step,  which  may  be 
very  critical  with  respect  to  the  end 
result.  In  such  a  case,  the  records  of 
periodic  inspections  may  be  adaptable 
to  the  control  chart  technique  of  qual- 
ity control. 

Quality Assessment

Procedures  used  to  evaluate  the  ef- 
fectiveness of  the  quality  control  sys- 
tem may  be  classified  according  to 
whether  the  evidence  arises  from  in- 
ternal or  external  sources.  Internal 
procedures,  useful  largely  for  estimat- 
ing precision,  include  the  use  of  inter- 
nal reference  samples  and  control 
charts  to  monitor  the  overall  perfor- 
mance of  the  measurement  system  as 
described  in  an  earlier  section.  Repli- 
cate measurements  on  replicate  or 
split  samples  can  provide  valuable  in- 
sight into  the  reproducibility  of  both 
the  measurement  and  sampling  pro- 
cesses. Comparison  of  the  results  ob- 
tained as  a  consequence  of  inter- 
change of  analysts,  equipment,  or 
combinations  of  these  can  attest  to  op- 
erational stability  as  well  as  identify 
malfunctions.  Measurements  made  on 
split  samples  using  a  completely  inde- 
pendent method  can  lend  confidence 
to  the  method  normally  in  use  or  indi- 
cate the  presence  of  measurement 
bias. 

External  quality  assessment  is  al- 
ways needed  since  it  can  detect  prob- 
lems of  bias  that  are  difficult  to  iden- 
tify by  internal  procedures.  Participa- 
tion in  collaborative  tests,  exchange  of 
samples  with  other  laboratories,  and 
the  use  of  certified  reference  materials 
are  time-honored  assessment  devices. 
NBS  Standard  Reference  Materials 
(SRMs)  (6)  are  especially  useful  for 
quality  assessment  in  cases  where  they 
are  available  and  applicable.  The  in- 
formation that  can  be  obtained  or  in- 
ferred by  their  use  is  described  in  a 
later  section.  Operators  of  monitoring 
networks  may  provide  proficiency 
testing  or  audit  samples  to  assess  labo- 
ratory performance.  Ordinary  prac- 
tices should  be  used  here,  so  that  nor- 
mal rather  than  optimum  perfor- 
mance is  measured. 

A  laboratory  should  diligently  use 
the  information  obtained  in  the  quali- 
ty assessment  process.  Adverse  data 
should  not  be  treated  in  a  defensive 
manner  but  the  reason  for  it  should  be 
investigated  objectively  and  thorough- 
ly. When  laboratory  records  are  reli- 
ably and  faithfully  kept,  the  task  of 
identifying  causes  of  problems  is  made 
easier.  This  is  an  important  reason  for 
developing  data  handling  protocols 
and  ensuring  that  all  protocols  are 
strictly  followed. 

Systematic  Errors 

Systematic errors or biases are of
two kinds: concentration-level
independent (constant), and
concentration-level related. The former
are sometimes called additive while the
latter
are  called  multiplicative.  Both  kinds 
may  be  present  simultaneously  in  a 
given measurement system. An example
of the first kind is the reagent
blank often present in measurements
involving  chemical  processing  steps. 
The  second  kind  can  result  from,  for 
example,  use  of  an  inaccurately  certi- 
fied calibrant. 

Systematic  errors  may  arise  from 
such  sources  as  faulty  calibrations,  the 
use  of  erroneous  physical  constants, 
incorrect  computational  procedures, 
improper units for measurement or
reporting data, and matrix effects on the
measurement. Some of these can be
eliminated  or  minimized  by  applying 
corrections  or  by  modification  of  the 
measurement  technique.  Others  may 
be  related  to  fundamental  aspects  of 
the  measurement  process.  The  most 
insidious  sources  of  error  are  those  un- 
known or  unsuspected  of  being 
present. 

One  of  the  most  important  sources 
of  error  in  modern  instrumental  mea- 
surements concerns  uncertainties  in 
the  calibrants  used  to  define  the  ana- 
lytical function  of  the  instrument.  The 
measurement  step  essentially  consists 
of  the  comparison  of  an  unknown  with 
a  known  (calibrant)  so  that  any  error 
in  the  latter  results  in  a  proportional 
error  in  the  former.  The  need  to  use 
calibrants  of  the  highest  reliability  is 
obvious. 

The  measurement  protocol  should 
include  a  detailed  analysis  of  the 
sources  of  error  and  correction  for 
them  to  the  extent  possible.  The 
uncertainties,  B,  referred  to  earlier, 
represent  the  uncertainties  in  the  cor- 
rections for  the  systematic  errors.  In 
making  such  an  estimate,  the  95%  con- 
fidence limits  should  be  assigned  to 
the  extent  possible.  The  magnitudes 
of  these  uncertainties  can  be  estimat- 
ed from  those  assigned  by  others  in 
the  case  of  such  factors  as  calibration 
standards  and  physical  constants. 
Other  constant  sources  of  error  may 
be  more  subtle  both  to  identify  and  to 
evaluate,  and  the  judgment  and  even 
intuition  of  the  experimenter  may  be 
the  only  sources  of  information. 

The  effectiveness  of  elimination  of, 
or  correction  for,  systematic  errors  is 
best  evaluated  from  external  quality 
assessment  procedures.  Differences 
found  between  known  and  measured 
values  of  test  samples,  such  as  SRMs, 
need  to  be  reconciled  with  the  labora- 
tory's own  estimates  of  bounds  for  its 
random  and  systematic  errors.  When 
the  random  error  is  well  established, 
as  by  the  quality  control  process,  sig- 
nificant discrepancies  can  be  attrib- 
uted to  unsuspected  or  incorrectly  es- 
timated systematic  errors. 
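
The reconciliation described above can be sketched as a simple comparison. This is a minimal illustration, not a prescribed procedure: the function `srm_bias_check`, its parameter names, and the choice to add the random limit, the laboratory's systematic bound B, and the certified uncertainty linearly are all assumptions made for the example. The Student's t value is supplied by the user for the appropriate degrees of freedom.

```python
import math
import statistics


def srm_bias_check(measurements, certified_value, t_value,
                   systematic_bound=0.0, certified_uncertainty=0.0):
    """Compare replicate results on a reference material with its certified value.

    Assumption: the allowance is the linear sum of the random limit
    (t * s / sqrt(n)), the laboratory's own systematic bound (B), and the
    uncertainty stated on the certificate.
    """
    n = len(measurements)
    mean = statistics.mean(measurements)
    s = statistics.stdev(measurements)
    random_limit = t_value * s / math.sqrt(n)
    difference = mean - certified_value
    allowance = random_limit + systematic_bound + certified_uncertainty
    return {
        "mean": mean,
        "difference": difference,
        "allowance": allowance,
        "consistent": abs(difference) <= allowance,
    }


# Hypothetical data: four replicates, t = 3.18 for 3 degrees of freedom.
result = srm_bias_check([10.2, 10.4, 10.3, 10.5], certified_value=10.0,
                        t_value=3.18, systematic_bound=0.05,
                        certified_uncertainty=0.02)
print(result)
```

In this hypothetical case the difference exceeds the allowance, which, if the random error is well established, would point to an unsuspected or underestimated systematic error.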

The  Use  of  SRMs  for  Quality 
Assessment 

An SRM is a material for which the
properties and composition are
certified by the National Bureau of
Standards (6, 7). To the extent that its
compositional properties simulate
those of the sample ordinarily
measured, its "correct" measurement can
imply "correct" measurement of the
usual samples. Such a conclusion
requires that the protocol of
measurement was the same in each case.
Hence it is necessary that no special
care be exercised in measuring the
SRM, other than that ordinarily used.

Figure 4. Typical analytical systematic
errors (bias). (a) = unbiased; (b) =
measurement-level related; (c) =
constant error; and (d) = combination of b
and c

Analysis  of  SRMs  has  been  recom- 
mended as  a  means  of  providing  "trace- 
ability"  to  national  measurement 
standards.  However,  a  word  of  caution 
is  appropriate  on  this  point.  Measure- 
ment processes  are  seldom  identical, 
so  that  traceability  is  most  often  based 
on  inference.  Also,  the  fact  that  an  ac- 
ceptable result  is  or  is  not  obtained  for 
an  SRM  provides  no  unique  explana- 
tion for  such  a  result. 

The  use  of  an  SRM  should  never  be 
attempted  until  the  analytical  system 
has  been  demonstrated  to  be  in  a  state 
of  statistical  control.  An  SRM  is  not 
needed  for  such  a  purpose  and  such 
use  is  discouraged.  Ordinarily,  the 
SRM  will  be  available  in  limited 
amount  so  that  the  statistics  of  the 
measurement  process  should  be  dem- 
onstrated by  measurements  on  other 
materials.  Only  under  such  a  situation 
can  the  results  of  an  SRM  measure- 
ment be  considered  as  representative 
of  the  measurement  system. 

A  consideration  of  the  nature  of  an- 
alytical errors,  shown  in  Figure  4,  will 
clarify  why  the  measurement  of  a  sin- 
gle SRM  may  not  be  fully  informative. 
It  will  be  noted  that  errors  may  be 
constant,  measurement-level  related, 
or  a  combination  of  these,  and  a  single 
right  or  wrong  result  will  not  indicate 
on  which  of  several  possible  curves  it 
might  lie.  Measurement  of  a  series  of 
SRMs  may  clarify  the  nature  of  the 
measurement  process  and  this  should 
be  done  whenever  possible.  An  inti- 
mate understanding  of  the  operation 
of  a  particular  measurement  system 
may  also  make  it  possible  to  eliminate 
some  of  the  possible  sources  of  error 
and  to  better  interpret  the  data  from 
measurement  of  SRMs. 
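
One way to see why a series of SRMs is more informative than a single one is to fit a straight line to measured versus certified values. The sketch below assumes a simple linear error model, measured = constant + factor x certified, which is only one possible reading of the error types described here; the function name `fit_bias` and the data are hypothetical.

```python
def fit_bias(certified, measured):
    """Least-squares fit of measured = intercept + slope * certified.

    Under the assumed linear model, the intercept estimates a constant
    (additive) bias and (slope - 1) a level-related (multiplicative) bias.
    """
    n = len(certified)
    mean_x = sum(certified) / n
    mean_y = sum(measured) / n
    sxx = sum((x - mean_x) ** 2 for x in certified)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(certified, measured))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical series of four reference materials at different levels.
certified = [1.0, 5.0, 10.0, 20.0]
measured = [1.6, 6.0, 11.5, 22.5]
intercept, slope = fit_bias(certified, measured)
print(f"constant bias ~ {intercept:.2f}, proportional bias ~ {100 * (slope - 1):.0f}%")
```

A single point cannot separate the two terms: any one (certified, measured) pair is consistent with a purely constant error, a purely proportional error, or a mixture of both.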

Record Keeping

Adequate  record  keeping  in  an  easi- 
ly retrievable  manner  is  an  essential 
part  of  the  quality  assurance  program. 
Records  needed  include  the  descrip- 
tion of  test  samples,  experimental  pro- 
cedures, and  data  on  calibration  and 
testing.  Quality  control  charts  should 
be  diligently  prepared  and  stored.  A 
chain  of  custody  of  test  materials 
should  be  operative  and  such  materi- 
als should  be  retained  and  safe- 
guarded until  there  is  no  doubt  about 
their  future  use  or  need. 

Data Control

The  evaluation,  review,  and  release 
of  analytical  data  is  an  important  part 
of  the  quality  assurance  process.  No 
data  should  be  released  for  external 
use until it has been carefully
evaluated. Guidelines for data evaluation,
applicable  to  almost  every  analytical 
situation,  have  been  developed  by  the 
ACS  Committee  on  Environmental 
Improvement  (8).  A  prerequisite  for 
release  of  any  data  should  be  the  as- 
signment of  uncertainty  limits,  which 
requires  the  operation  of  some  kind  of 
a  quality  assurance  program.  Formal 
release  should  be  made  by  a  profes- 
sional analytical  chemist  who  certifies 
that  the  work  was  done  with  reason- 
able care  and  that  assigned  limits  of 
uncertainty  are  applicable. 

Laboratory  Accreditation 

Laboratory  accreditation  is  one 
form  of  quality  assurance  for  the  data 
output  of  certified  laboratories.  Ac- 
creditation is  based  on  criteria  that 
are  considered  essential  to  generate 
valid  data  and  is  a  formal  recognition 
that  the  laboratory  is  competent  to 
carry  out  a  specific  test  or  specific 
type  of  test  (9, 10).  The  certification  is 
as  meaningful  as  the  care  exercised  in 
developing  certification  criteria  and 
evaluating  laboratory  compliance. 
Generic  criteria  developed  by  national 
and  international  standardization  or- 
ganizations have  been  influential  in 
this  respect  (11).  These  criteria  are 
well  conceived  and  provide  general 
guidance  for  the  sound  operation  of 
analytical  laboratories,  whether  or  not 
certification  is  involved. 

Implementation 

Detailed  quality  assurance  plans  are 
ineffective  unless  there  is  commitment 
to  quality  by  all  concerned.  This  com- 
mitment must  be  total,  from  manage- 
ment to  technical  staff.  The  former 
must  provide  the  resources,  training, 
facilities,  equipment,  and  encourage- 
ment required  to  do  quality  work.  The 
latter  must  have  the  technical  ability 
and  motivation  to  produce  quality 
data.  Some  may  argue  that  if  there  is 
such  commitment,  there  is  no  need  for 
a formal quality assurance program.
However, the experience of many
laboratories has demonstrated that a
formal quality assurance program
provides constant guidance for the
attainment of the quality goals desired.

Appendix D.2


Sampling  for 
Chemical 
Analysis 


A  major  consideration  in  the  reli- 
ability of  any  analytical  measurement 
is  that  of  sample  quality.  Too  little  at- 
tention is  directed  to  this  matter.  The 
analyst  often  can  only  report  results 
obtained  on  the  particular  test  speci- 
men at  the  moment  of  analysis,  which 
may  not  provide  the  information  de- 
sired or  needed.  This  may  be  because 
of  uncertainties  in  the  sampling  pro- 
cess, or  in  sample  storage,  preserva- 
tion, or  pretreatment  prior  to  analysis. 
The  sampling  plan  itself  is  often  so 
poorly  considered  as  to  make  relation 
of  the  analytical  results  to  the  popula- 
tion from  which  the  sample  was  drawn 
uncertain,  or  even  impossible  to  inter- 
pret. 

All  of  the  above  aspects  of  sampling 
merit  full  consideration  and  should  be 
addressed in every analytical
determination. Because the scope is so broad,
we will limit the present discussion to
a  small  segment  of  the  total  problem, 
that  of  sampling  bulk  materials.  For 
such  materials  the  major  steps  in  sam- 
pling are: 

•  identification  of  the  population 
from  which  the  sample  is  to  be  ob- 
tained, 

•  selection  and  withdrawal  of  valid 
gross  samples  of  this  population,  and 

•  reduction  of  each  gross  sample  to  a 
laboratory  sample  suitable  for  the  an- 
alytical techniques  to  be  used. 


The  analysis  of  bulk  materials  is 
one  of  the  major  areas  of  analytical  ac- 
tivity. Included  are  such  problems  as 
the  analysis  of  minerals,  foodstuffs, 
environmentally  important  sub- 
stances, and  many  industrial  products. 
We  shall  discuss  the  major  consider- 
ations in  designing  sampling  programs 
for  such  materials.  While  our  discus- 
sion is  specifically  directed  toward 
solid  materials,  extension  to  other 
materials  will  often  be  obvious. 

A  brief  list  of  definitions  commonly 
used  in  bulk  sampling  is  provided  in 
the  glossary. 

Preliminary  Considerations  in 
Sampling 

Poor  analytical  results  may  be 
caused  in  many  ways — contaminated 
reagents,  biased  methods,  operator  er- 
rors in  procedure  or  data  handling, 
and  so  on.  Most  of  these  sources  of 
error  can  be  controlled  by  proper  use 
of  blanks,  standards,  and  reference 
samples.  The  problem  of  an  invalid 
sample,  however,  is  special;  neither 
control  nor  blank  will  avail.  Accord- 
ingly, sampling  uncertainty  is  often 
treated separately from other
uncertainties in an analysis. For random
errors the overall standard deviation,
$s_o$, is related to the standard
deviation for the sampling operation,
$s_s$, and to that for the remaining
analytical operations, $s_a$, by the
expression: $s_o^2 = s_a^2 + s_s^2$.
Whenever possible, measurements
should be conducted in such a way
that the components of variance
arising from sample variability and
measurement variability can be separately
evaluated. If the measurement process
is demonstrated to be in a state of
statistical control so that $s_a$ is already
known, $s_s$ can be evaluated from $s_o$,
found  by  analysis  of  the  samples.  Oth- 
erwise, an  appropriate  series  of  repli- 
cate measurements  or  replicate  sam- 
ples can  be  devised  to  permit  evalua- 
tion of  both  standard  deviations. 
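
Both routes to the sampling standard deviation can be sketched in a few lines. This is an illustration only: the function names, the balanced design (every sample measured the same number of times), and the data are assumptions, and negative variance estimates are simply truncated to zero.

```python
import math
import statistics


def sampling_sd_from_known(s_overall, s_analytical):
    """Sampling SD when the analytical SD is already known.

    Uses the additivity of variances: overall^2 = analytical^2 + sampling^2.
    """
    variance = s_overall ** 2 - s_analytical ** 2
    return math.sqrt(max(variance, 0.0))


def variance_components(replicates):
    """Estimate (analytical SD, sampling SD) from replicate measurements.

    `replicates` is a list of lists: one inner list of r >= 2 replicate
    measurements per sample, all of equal length (balanced design assumed).
    """
    r = len(replicates[0])
    # Pooled within-sample variance estimates the analytical variance.
    within = statistics.mean([statistics.variance(g) for g in replicates])
    # The variance of the sample means contains sampling variance plus within / r.
    means = [statistics.mean(g) for g in replicates]
    between = statistics.variance(means) - within / r
    return math.sqrt(within), math.sqrt(max(between, 0.0))


print(sampling_sd_from_known(5.0, 3.0))
s_analytical, s_sampling = variance_components([[10, 12], [20, 22], [30, 32]])
print(round(s_analytical, 2), round(s_sampling, 2))
```

The second function needs no prior knowledge of the measurement precision, at the cost of at least duplicate measurements on every sample.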

Youden  has  pointed  out  that  once 
the  analytical  uncertainty  is  reduced 
to a third or less of the sampling
uncertainty, further reduction in the
analytical uncertainty is of little
importance (1). Therefore, if the sampling
uncertainty  is  large  and  cannot  be  re- 
duced, a  rapid,  approximate  analytical 
method  may  be  sufficient,  and  further 
refinements  in  the  measurement  step 
may  be  of  negligible  aid  in  improving 
the  overall  results.  In  fact,  in  such 
cases  a  rapid  method  of  low  precision 
that  permits  more  samples  to  be  ex- 
amined may  be  the  best  route  to  re- 
ducing the  uncertainty  in  the  average 
value  of  the  bulk  material  under  test. 
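
Youden's rule of thumb can be checked with a short calculation. The lines below are a worked illustration, assuming that the sampling and analytical variances add, and writing $s_s$ for the sampling standard deviation, $s_a$ for the analytical standard deviation, and $s_o$ for the overall standard deviation.

```latex
\begin{align*}
s_o &= \sqrt{s_s^2 + s_a^2} \\
\text{with } s_a = \tfrac{1}{3}\,s_s:\quad
s_o &= \sqrt{s_s^2 + \tfrac{1}{9}\,s_s^2}
     = s_s\sqrt{\tfrac{10}{9}} \approx 1.054\,s_s \\
\text{with } s_a = 0:\quad
s_o &= s_s
\end{align*}
```

On these assumptions, eliminating the analytical uncertainty entirely would lower the overall standard deviation by only about 5%, whereas analyzing $n$ independent samples reduces the standard deviation of their mean by a factor of $1/\sqrt{n}$.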

Figure 1. Relative standard deviation
associated with the sampling and
analysis operations in testing peanuts
for aflatoxins (after T. B. Whittaker,
Pure and Appl. Chem., 49, 1709 (1977))

An excellent example of the importance
of sampling is given in the
determination of aflatoxins in peanuts (2).
The  aflatoxins  are  highly  toxic  com- 
pounds produced  by  molds  that  grow 
best  under  warm,  moist  conditions. 
Such  conditions  may  be  localized  in  a 
warehouse,  resulting  in  a  patchy  dis- 
tribution of  highly  contaminated  ker- 
nels. One  badly  infected  peanut  can 
contaminate  a  relatively  large  lot  with 
unacceptable  levels  (above  about  25 
ppb  for  human  consumption)  of  afla- 
toxins after  grinding  and  mixing.  The 
standard  deviations  of  the  three  oper- 
ations of  sampling,  subsampling,  and 
analysis  are  shown  in  Figure  1.  The 
analytical  procedure  consists  of  sol- 
vent extraction  followed  by  thin-layer 
chromatography  and  measurement  of 
the  fluorescence  of  the  aflatoxin  spots. 
Clearly,  sampling  is  the  major  source 
of  the  analytical  uncertainty. 

Types  of  Samples 

Random  Samples.  In  common  with 
the  statistician,  the  analytical  chemist 
ordinarily  wishes  to  generalize  from  a 
small  body  of  data  to  a  larger  body  of 
data.  While  the  specimen/sample  ac- 
tually examined  is  sometimes  the  only 
matter  of  interest,  the  characteristics 
of  the  population  of  specimens  are  fre- 
quently desired.  Obviously,  the  sam- 
ples under  examination  must  not  be 
biased,  or  any  inferences  made  from 
them  will  likewise  be  biased. 


Statisticians  carefully  define  several 
terms  that  are  applied  to  statistical  in- 
ference. The  target  population  de- 
notes the  population  to  which  we 
would  like  our  conclusions  to  be  appli- 
cable, while  the  parent  population 
designates  that  from  which  samples 
were  actually  drawn.  In  practice  these 
two  populations  are  rarely  identical, 
although  the  difference  may  be  small. 
This  difference  may  be  minimized 
when  the  selection  of  portions  for  ex- 
amination is  done  by  a  random  pro- 
cess. In  such  a  process  each  part  of  the 
population  has  an  equal  chance  of 
being  selected.  Thus,  random  samples 
are  those  obtained  by  a  random  sam- 
pling process  and  form  a  foundation 
from  which  generalizations  based  on 
mathematical  probability  can  be 
made. 

Random  sampling  is  difficult.  A 
sample  selected  haphazardly  is  not  a 
random  sample.  On  the  other  hand, 
samples  selected  by  a  defined  protocol 
are  likely  to  reflect  the  biases  of  the 
protocol.  Even  under  the  most  favor- 
able circumstances,  unconscious  selec- 
tion and  biases  can  occur.  Also,  it  can 
be difficult to convince untrained
individuals assigned the task of
obtaining samples that an apparently
unsystematic  collection  pattern  must 
be  followed  closely  for  it  to  be  valid. 

Whenever possible, the use of a
table of random numbers is
recommended as an aid to sample selection.
The  bulk  material  is  divided  into  a 
number  of  real  or  imaginary  segments. 
For  example,  a  body  of  water  can  be 
conceptually  subdivided  into  cells, 
both  horizontally  and  vertically,  and 
the  cells  to  be  sampled  selected  ran- 
domly. To  do  this  each  segment  is  as- 
signed a  number,  and  selection  of  seg- 
ments from  which  sample  increments 
are  to  be  taken  is  made  by  starting  in 
an  arbitrary  place  in  a  random  num- 
ber table  and  choosing  numbers  ac- 
cording to  a  predecided  pattern.  For 
example,  one  could  choose  adjacent, 
alternate,  or  nth  entries  and  sample 
those  segments  whose  numbers  occur 
until  all  of  the  samples  decided  upon 
have  been  obtained. 
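
The segment-selection procedure can be sketched as follows. Note the substitution: a seeded pseudo-random number generator stands in for the printed random number table, and the function name `select_cells` and the grid dimensions are hypothetical.

```python
import random
from itertools import product


def select_cells(n_rows, n_cols, n_depths, n_samples, seed=None):
    """Pick cells of a conceptual 3-D grid at random, without replacement.

    Each cell (row, column, depth) has an equal chance of selection.
    Recording the seed plays the role of noting the starting place in a
    random number table, so the selection can be reproduced later.
    """
    cells = list(product(range(n_rows), range(n_cols), range(n_depths)))
    rng = random.Random(seed)
    return rng.sample(cells, n_samples)


# Hypothetical body of water: 4 x 5 horizontal cells, 3 depth layers.
chosen = select_cells(4, 5, 3, n_samples=6, seed=7)
for row, col, depth in chosen:
    print(f"sample cell row={row} col={col} depth={depth}")
```

Fixing the list of cells before going into the field leaves no room for on-site judgment about which locations look convenient or typical.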

The  results  obtained  for  these  and 
other  random  samples  can  be  analyzed 
by  some  model  or  plan  to  identify 
whether  systematic  relations  exist. 
This  is  important  because  of  the  possi- 
ble introduction  of  apparent  correla- 
tions due  to  systematic  trends  or  bias- 
es in  the  measurement  process.  Ac- 
cordingly, measurement  plans  should 
always  be  designed  to  identify  and 
minimize  such  problems. 

Despite the disadvantages, sampling
at evenly spaced intervals over
the bulk is still often used in place of
random sampling owing to its
simplicity. Because this procedure is more
subject to bias than random sampling,
it is not recommended. If it is used,
the results must be closely monitored
to ensure that errors from periodicity
in the material are not introduced.

Define goals.

Select analytical procedures, number of
analyses, and sampling sites on basis of
goals, time and cost constraints, and
personnel and apparatus available.

Collect samples; reduce to suitable test
portions.

Carry out preliminary operations (dissolve,
adjust conditions, separate interferences);
acquire data on test portions.

Select best value from data, estimate
reliability of value, assess validity of model,
revise model and repeat if necessary.

Figure 2. The place of sampling in the overall analytical process

Systematic  Samples.  Frequently, 
samples  are  obtained  and  analyzed  to 
reflect  or  test  some  systematic  hy- 
pothesis, such  as  changes  in  composi- 
tion with  time,  temperature,  or  spatial 
location.  Such  samples,  if  collected  in 
a  systematic  manner,  may  each  be 
considered  to  represent  a  separate  dis- 
crete population  under  the  existing 
conditions.  However,  the  results  may 
still  be  statistically  tested  for  the  sig- 
nificance of  any  apparent  differences. 

In  a  carefully  designed  sampling 
plan,  consideration  should  be  given  to 
the  possible  concurrence  of  unantici- 
pated events  or  phenomena  that  could 
prejudice  the  information  on  the  sam- 
ple measured.  For  example,  measure- 
ments to  be  taken  at  time  intervals  are 
sometimes  made  with  a  random  start 
or other superimposed random time
element. Needless to say, the less
known  about  a  given  process,  the  more 
randomness  is  merited.  Conversely,  as 
a  process  is  more  fully  understood, 
systematic  approaches  can  provide 
maximum  efficiency  of  data  acquisi- 
tion. 

Representative Samples. The
term "representative sample" is
frequently used in analytical discussions
to  connote  a  single  sample  of  a  uni- 
verse or  population  (e.g.,  waste  pile, 
lagoon,  ground  water)  that  can  be  ex- 
pected to  exhibit  average  properties  of 
the  population  (see  glossary).  Ob- 
viously, such  a  sample  cannot  be  se- 
lected by  a  random  process.  And  even 
if  it  could,  to  ascertain  the  validity  of 
its  representativeness  would  require 
considerable  effort. 

The  concept  of  a  truly  representa- 
tive sample  would  appear  to  be  valid 
in  only  two  cases.  The  first  case  in- 
volves samples  defined  a  priori  as  rep- 
resentative for  a  specific  purpose.  For 
example, the Hazardous Waste
Management System prescribes seven
protocols for sampling wastes (ranging
from viscous liquids, solids, or
containerized liquids to reservoirs) to
provide samples that "will be
considered by the Agency (EPA) to be
representative of the waste" (3). The
second case involves the sampling of
truly  homogeneous  materials. 

While  the  measurement  of  samples 
defined  as  representative  may  reduce 
analytical  costs,  the  information  so 
obtained  ordinarily  does  not  enjoy  the 
status  of  that  obtained  from  valid  ran- 
dom samples  of  the  population.  An  ex- 
ception is  when  effort  has  been  vigor- 
ously exerted  to  homogenize  the  popu- 
lation prior  to  sampling.  Such  pro- 
cesses are  difficult  and  are  ordinarily 
only  justified  when  the  objective  is  to 
produce  a  number  of  subsamples  of 
essentially  similar  properties. 

Because  of  the  difficulties  of  se- 
lecting or  producing  a  "representative 
sample"  it  is  recommended  that  this 
concept  be  discouraged  for  general 
purposes  and  reserved  only  for  cases 
where  the  effort  required  to  prepare 
such  a  sample  is  justified.  An  appre- 
ciation of  the  compositional  informa- 
tion that  is  lost  as  a  result  is  a  further 
reason  to  discourage  the  practice. 
With  a  properly  designed  and  execut- 
ed random  sampling  plan,  the  valu- 
able characteristics  of  sample  mean 
and  variation  between  members  can 
be  ascertained,  neither  of  which  can 
be  obtained  by  measurement  of  one 
"representative  sample." 

Composite  Samples.  A  composite 
sample  (see  glossary)  may  be  consid- 
ered as  a  special  way  of  attempting  to 
produce  a  representative  sample. 
Many  sampling  procedures  are  based 
on  the  assumption  that  average  com- 
position is  the  only  information  de- 
sired. Such  averages  may  be  bulk  av- 
erages, time-weighted  averages,  and 
flow-proportional  averages,  for  exam- 
ple, and  may  be  obtained  by  measure- 
ment of  a  composite,  suitably  pre- 
pared or  collected.  Elaborate  proce- 
dures involving  crushing,  grinding, 
mixing,  and  blending  have  been  devel- 
oped and  even  standardized  for  the 
preparation  of  solid  composites,  while 
sampling  systems  for  liquids  (especial- 
ly water)  have  been  developed  to  ob- 
tain various  composite  samples. 
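
The averages a composite is meant to deliver can also be computed from individual increments, which additionally reveals their spread. The sketch below is illustrative only; the function name `weighted_average` and the concentration, flow, and duration figures are hypothetical.

```python
import statistics


def weighted_average(values, weights):
    """Weighted mean, e.g. flow-proportional or time-weighted."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total


concentrations = [2.0, 4.0, 6.0]   # one result per increment
flows = [10.0, 30.0, 60.0]         # flow during each increment
durations = [1.0, 1.0, 2.0]        # time covered by each increment

flow_avg = weighted_average(concentrations, flows)
time_avg = weighted_average(concentrations, durations)
spread = statistics.stdev(concentrations)
print(flow_avg, time_avg, spread)
```

A physically prepared composite would yield only one of these averages; the spread between increments is available only when they are analyzed individually.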

Analysis  of  a  number  of  individual 
samples  permits  determination  of  the 
average  (at  the  expense  of  extra  ana- 
lytical effort)  and  the  distribution  of 
samples  within  the  population  (be- 
tween-sample variability).  In  some 
cases,  it  may  be  of  interest  to  isolate 
the  within-sample  variability  as  well. 
All  this  information  is  necessary  for 
collaborative  test  samples  and  in  ref- 
erence material  usage,  especially  when 
apparent  differences  in  analytical  re- 
sults within  and  between  laboratories 
need  to  be  evaluated. 

Table I. Confidence Intervals and Statistical Tolerance Limits (a)

  n      t (b)    ts/√n    K (c)    Ks
  2      12.70    ±18      37.67    ±75
  4       3.18    ±3.2      6.37    ±12.9
  8       2.36    ±1.7      3.73    ±7.4
  16      2.18    ±1.1      2.90    ±5.8
  32      2.04    ±0.7      2.50    ±5.0
  100     1.98    ±0.4      2.23    ±4.4
  ∞       1.96     0        1.96    ±4.0

(a) Calculated for s = 2, based on measurement of n samples
(b) 95% confidence limits for the mean of n samples
(c) Based on a 95% confidence that the interval will contain 95% of the samples
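
The two kinds of interval in Table I can be recomputed from the tabulated factors. The sketch below is a hedged illustration: the names `FACTORS` and `interval_half_widths` are hypothetical, the t and K values are copied from the table as printed rather than recalculated, and s = 2 follows the table's footnote.

```python
import math

S = 2.0  # standard deviation assumed in Table I

# n: (t, K) as printed in Table I
FACTORS = {
    2: (12.70, 37.67),
    4: (3.18, 6.37),
    8: (2.36, 3.73),
    16: (2.18, 2.90),
    32: (2.04, 2.50),
    100: (1.98, 2.23),
}


def interval_half_widths(n, t, k, s=S):
    """Return (confidence half-width for the mean, tolerance half-width)."""
    confidence = t * s / math.sqrt(n)  # shrinks as n grows
    tolerance = k * s                  # approaches 1.96 * s, never zero
    return confidence, tolerance


for n, (t, k) in FACTORS.items():
    ci, tol = interval_half_widths(n, t, k)
    print(f"n={n:>3}  mean: ±{ci:.1f}   individual samples: ±{tol:.1f}")
```

The confidence half-widths computed this way agree with the ts/√n column to the precision shown. Some printed Ks entries differ slightly from K × s (for n = 4, 6.37 × 2 = 12.74 against a printed ±12.9), which may reflect rounding or transcription in the source. The contrast between the columns is the main point: more samples narrow the interval for the mean toward zero, while the interval covering individual samples levels off near ±1.96s.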


Because  of  the  limited  information 
provided  by  a  composite  sample,  full 
consideration  should  be  given  to  the 
consequences  before  deciding  between 
this  approach  and  the  analysis  of  indi- 
vidual samples. 

Subsampling.  Usually,  the  sample 
received  by  the  analytical  laboratory 
will  be  larger  than  that  required  for  a 
single  measurement,  so  some  sub- 
sampling  (see  glossary)  will  be  re- 
quired. Often,  test  portions  (see  glos- 
sary) must  be  taken  for  replicate  mea- 
surements or  for  measurement  of  dif- 
ferent constituents  by  several  tech- 
niques. Obviously,  such  test  portions 
must  be  sufficiently  alike  that  the  re- 
sults are  compatible.  Frequently  it  is 
necessary  to  reduce  particle  size,  mix, 
or  otherwise  process  the  laboratory 
sample  (see  glossary)  before  with- 
drawing portions  (subsamples)  for 
analysis.  The  effort  necessary  at  this 
stage  depends  on  the  degree  of  homo- 
geneity of  the  original  sample.  In  gen- 
eral, the  subsampling  standard  devia- 
tion should  not  exceed  one-third  of 
the  sampling  standard  deviation.  Al- 
though this  may  sound  appreciable,  it 
is  wasteful  of  time  and  effort  to  de- 
crease it  below  this  level.  But  this  does 
not  mean  care  is  unnecessary  in  sub- 
sampling.  If  a  sample  is  already  homo- 
geneous, care  may  be  needed  to  avoid 
introducing  segregation  during  sub- 
sampling.  Even  though  analysts  may 
not  be  involved  with  sample  collec- 
tion, they  should  have  sufficient 
knowledge  of  sampling  theory  to  sub- 
sample  properly.  They  should  also  be 
provided  with  any  available  informa- 
tion on  the  homogeneity  of  the  sam- 
ples received  so  that  they  can  subsam- 
ple  adequately  and  efficiently. 

Model  of  the  Sampling  Operation 

Before  sampling  is  begun,  a  model 
of  the  overall  operation  should  be  es- 
tablished (Figure  2).  The  model 
should  consider  the  population  to  be 
studied,  the  substance(s)  to  be  mea- 
sured, the  extent  to  which  speciation 
is  to  be  determined,  the  precision  re- 
quired, and  the  extent  to  which  the 
distribution  of  the  substance  within 
the  population  is  to  be  obtained. 

The  model  should  identify  all  as- 
sumptions made  about  the  population 
under  study.  Once  the  model  is  com- 
plete, a  sampling  plan  can  be  estab- 
lished. 

The  Sampling  Plan 

The  plan  should  include  the  size, 
number,  and  location  of  the  sample  in- 
crements and,  if  applicable,  the  extent 
of  compositing  to  be  done.  Procedures 
for  reduction  of  the  gross  sample  (see 
glossary)  to  a  laboratory  sample,  and 
to the test portions, should be
specified. All of this should be written as a
detailed protocol before work is
begun.  The  protocol  should  include 
procedures  for  all  steps,  from  sam- 
pling through  sample  treatment,  mea- 
surement, and  data  evaluation;  it 
should  be  revised  as  necessary  during 
execution  as  new  information  is  ob- 
tained. The  guidelines  for  data  acqui- 
sition and  quality  evaluation  in  envi- 
ronmental chemistry  set  out  by  the 
ACS  Subcommittee  on  Environmental 
Analytical  Chemistry  are  sufficiently 
general  to  be  recommended  reading 
for  workers  in  all  fields  (4). 

The  sampling  protocol  should  in- 
clude details  of  when,  where,  and  how 
the  sample  increments  are  to  be  taken. 
On-site  criteria  for  collection  of  a  valid 
sample  should  be  established  before- 
hand. Frequently,  decisions  must  be 
made  at  the  time  of  sampling  as  to 
components  likely  to  appear  in  the 
sample  that  may  be  considered  for- 
eign, that  is,  not  part  of  the  popula- 
tion. For  example,  a  portion  of 
dredged  sediment  in  which  the  mercu- 
ry content  is  to  be  determined  might 
contain  cans,  discarded  shoes,  rocks  or 
other  extraneous  material.  For  the  in- 
formation sought  these  items  might  be 
considered  foreign  and  therefore  legit- 
imately rejected.  Decisions  as  to  rejec- 
tion become  less  clear  with  smaller 
items.  Should  smaller  stones  be  reject- 
ed? How  small?  And  what  about  bits 
of  metal,  glass,  leather,  and  so  on?  Cri- 
teria for  such  decisions  should  be 
made  logically  and  systematically,  if 
possible  before  sampling  is  initiated. 

The  type  of  container,  cleaning  pro- 
cedure, and  protection  from  contami- 
nation before  and  after  sampling  must 
be  specified.  The  question  of  sample 
preservation, including possible addition
of preservatives and refrigeration,
should  be  addressed.  Some  sampling 
plans  call  for  field  blanks  and/or  field- 
spiked  samples.  The  critical  nature  of 
the latter and the difficulties possible
under  field  conditions  require  the  ut- 
most care  in  planning  and  execution  of 


the  sampling  operation  if  the  results 
are to be meaningful.

Whenever  possible,  the  analyst 
should  perform  or  directly  supervise 
the  sampling  operation.  If  this  is  not 
feasible,  a  written  protocol  should  be 
provided  and  the  analyst  should  en- 
sure that  those  collecting  the  samples 
are  well-trained  in  the  procedures  and 
in  use  of  the  sampling  equipment,  so 
that  bias  and  contamination  are  mini- 
mized. No  less  important  is  careful  la- 
beling and  recording  of  samples.  A 
chain  of  custody  should  be  established 
such  that  the  integrity  of  the  samples 
from  source  to  measurement  is  en- 
sured. Often  auxiliary  data  must  be 
recorded  at  the  time  the  sample  is 
taken:  temperature,  position  of  the 
collecting  probe  in  the  sample  stream, 
flow  velocity  of  the  stream,  and  so  on. 
Omission  or  loss  of  such  information 
may  greatly  decrease  the  value  of  a 
sample,  or  even  render  it  worthless. 

Sampling  Bulk  Materials.  Once 
the  substances  to  be  determined,  to- 
gether with  the  precision  desired,  have 
been  specified,  the  sampling  plan  can 
be  designed.  In  designing  the  plan,  one 
must  consider: 

•  How  many  samples  should  be 
taken? 

•  How  large  should  each  be? 

•  From  where  in  the  bulk  material 
(population)  should  they  be  taken? 

•  Should  individual  samples  be  ana- 
lyzed, or  should  a  composite  be  pre- 
pared? 

These  questions  cannot  be  an- 
swered accurately  without  some 
knowledge  of  the  relative  homogeneity 
of  the  system.  Gross  samples  should 
be  unbiased  with  respect  to  the  differ- 
ent sizes  and  types  of  particles  present 
in  the  bulk  material.  The  size  of  the 
gross  sample  is  often  a  compromise 
based  on  the  heterogeneity  of  the  bulk 
material  on  the  one  hand,  and  the  cost 
of  the  sampling  operation  on  the 
other. 

Figure 3. Sampling diagram of sodium-24 in human liver homogenate (from Reference 7)

When the properties of a material
to be sampled are unknown, a good
approach  is  to  collect  a  small  number 
of  samples,  using  experience  and  intu- 
ition as  a  guide  to  making  them  as 
representative  of  the  population  as 
possible,  and  analyze  for  the  compo- 
nent of  interest.  From  these  prelimi- 
nary analyses,  the  standard  deviation 
$s_s$ of the individual samples can be
calculated, and confidence limits for
the average composition can be established
using the relation

$$\mu = \bar{x} \pm \frac{t\, s_s}{\sqrt{n}} \qquad (1)$$

where $\mu$ is the true mean value of the
population, $\bar{x}$ is the average of the analytical
measurements, and $t$ is obtained
from statistical tables for $n$
measurements (often given as $n - 1$
degrees of freedom) at the desired
level of confidence, usually 95%. Table
I  lists  some  t  values;  more  extensive 
tables  are  provided  in  books  on  quan- 
titative analysis  and  statistics  (5). 
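
As an illustration of Equation 1, the following minimal sketch computes confidence limits for the mean from a handful of preliminary results. The data values and the function name `confidence_limits` are hypothetical; the t value is the standard two-sided 95% entry for 4 degrees of freedom and must be chosen by the user to match the number of measurements.

```python
import math
import statistics

def confidence_limits(values, t):
    """Return (lower, upper) limits for the mean: x_bar +/- t * s_s / sqrt(n)."""
    n = len(values)
    x_bar = statistics.mean(values)
    s_s = statistics.stdev(values)   # sample standard deviation, n - 1 in the divisor
    half_width = t * s_s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical preliminary results (percent analyte) from five samples.
prelim = [10.2, 9.8, 10.5, 10.1, 9.9]
T_95_DF4 = 2.776   # two-sided 95% Student's t for n - 1 = 4 degrees of freedom
low, high = confidence_limits(prelim, T_95_DF4)
print(f"95% confidence limits for the mean: {low:.2f} to {high:.2f}")
```

With so few samples the interval is wide, which is the usual motivation for the refinement cycle described in the text.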

On  the  basis  of  this  preliminary  in- 
formation, a  more  refined  sampling 
plan  can  be  devised,  as  described  in 
the  following  sections.  After  one  or 
two  cycles  the  parameters  should  be 
known  with  sufficient  confidence  that 
the  optimum  size  and  number  of  the 
samples  can  be  estimated  with  a  high 
level  of  confidence.  The  savings  in 
sampling  and  analytical  time  and 
costs  by  optimizing  the  sampling  pro- 
gram can  be  considerable. 

Minimum  Size  of  Individual  In- 
crements. Several  methods  have  been 
developed  for  estimation  of  the 
amount  of  sample  that  should  be 
taken  in  a  given  increment  so  as  not  to 
exceed  a  predetermined  level  of  sam- 


pling uncertainty.  One  approach  is 
through  use  of  Ingamells's  sampling 
constant (6). Based on the knowledge
that the between-sample standard deviation
$s_s$ (Equation 1) decreases as
the sample size is increased, Ingamells
has shown that the relation

$$W R^2 = K_s \qquad (2)$$

is valid in many situations. In Equation 2,
$W$ represents the weight of
sample analyzed, $R$ is the relative
standard deviation (in percent) of
sample composition, and $K_s$ is the
sampling constant, corresponding to
the weight of sample required to limit
the sampling uncertainty to 1% with
68% confidence. The magnitude of $K_s$
may be determined by estimating $s_s$
from a series of measurements of samples
of weight $W$.

Once $K_s$ is evaluated for a given
sample, the minimum weight $W$ required
for a maximum relative standard
deviation of $R$ percent can be
readily calculated.

An  example  of  an  Ingamells  sam- 
pling constant  diagram  is  shown  in 
Figure  3  for  a  human  liver  sample 
under  study  in  the  National  Environ- 
mental Specimen  Bank  Pilot  Program 
at  the  National  Bureau  of  Standards 
(NBS)  in  conjunction  with  the  Envi- 
ronmental Protection  Agency  (7).  A 
major  goal  of  the  program  is  to  evalu- 
ate specimen  storage  under  different 
conditions.  This  requires  analysis  of 
small  test  portions  of  individual  liver 
specimens.  The  material  must  be  suf- 
ficiently homogeneous  that  variability 
between  test  portions  does  not  mask 
small  variations  in  composition  owing 


to  changes  during  storage.  The  homo- 
geneity of  a  liver  sample  for  sodium 
was  assessed  by  a  radiotracer  study  in 
which  a  portion  was  irradiated,  added 
to  the  remainder  of  the  specimen,  and 
the  material  homogenized.  Several 
test  portions  were  then  taken  and  the 
activity of $^{24}$Na measured as an indicator
of the distribution of sodium in
the samples. From Figure 3 it can be
seen that the weight of sample required
to yield an inhomogeneity of
1% (±2.4 counts g$^{-1}$ s$^{-1}$) is about 35 g.
For  a  subsample  of  one  gram,  a  sam- 
pling uncertainty  of  about  5%  can  be 
expected. 
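
The arithmetic behind Ingamells's relation (sample weight times the square of the percent relative standard deviation equals the sampling constant) can be sketched as follows. The function names are hypothetical, and the 35 g value is simply the sampling constant quoted above for the liver homogenate.

```python
import math

def sampling_constant(weight_g, rsd_percent):
    """Ks = W * R**2, with W in grams and R in percent."""
    return weight_g * rsd_percent ** 2

def min_weight(ks_g, max_rsd_percent):
    """Smallest sample weight (g) that keeps the sampling RSD at or below the target."""
    return ks_g / max_rsd_percent ** 2

def expected_rsd(ks_g, weight_g):
    """Percent relative standard deviation expected for a sample of the given weight."""
    return math.sqrt(ks_g / weight_g)

KS_LIVER = 35.0   # grams, the value quoted in the text for the liver example
print(f"weight for 1% RSD: {min_weight(KS_LIVER, 1.0):.0f} g")
print(f"RSD for a 1 g subsample: {expected_rsd(KS_LIVER, 1.0):.1f}%")
```

For a 1 g subsample the relation gives roughly 6%, of the same order as the "about 5%" read from the figure.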

Minimum  Number  of  Individual 
Increments.  Unless  the  population  is 
known  to  be  homogeneous,  or  unless  a 
representative  sample  is  mandated  by 
some  analytical  problem,  sufficient 
replicate  samples  (increments)  must 
be  analyzed.  To  determine  the  mini- 
mum number  of  sample  increments,  a 
sampling  variance  is  first  obtained,  ei- 
ther from  previous  information  on  the 
bulk  material  or  from  measurements 
made  on  the  samples.  The  number  of 
samples  necessary  to  achieve  a  given 
level  of  confidence  can  be  estimated 
from  the  relation 

$$n = \frac{t^2 s_s^2}{R^2 \bar{x}^2} \qquad (3)$$

where $t$ is the Student's t-table value
for the level of confidence desired, $s_s^2$
and $\bar{x}$ are estimated from preliminary
measurements on or from previous
knowledge of the bulk material, and $R$
is the percent relative standard deviation
acceptable in the average (so that $s_s/\bar{x}$
must likewise be expressed in percent). Initially
$t$ can be set at 1.96 for 95% confidence
limits and a preliminary value
of $n$ calculated. The $t$ value for this $n$
can then be substituted and the system
iterated to constant $n$. This expression
is applicable if the sought-for
component is distributed in a positive
binomial, or a Gaussian, distribution.
Such distributions are characterized
by having an average, $\mu$, that is larger
than the variance, $\sigma_s^2$. Remember that
values of $\sigma_s$ (and $s_s$) may depend
greatly on the size of the individual
samples.
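
A minimal sketch of this calculation is given below. Two choices here are assumptions, not part of the source: the short table of two-sided 95% t values is looked up conservatively (nearest tabulated degrees of freedom not exceeding the actual value), and instead of repeated substitution, which can oscillate between two adjacent values of n, the sketch searches for the smallest n that satisfies the relation with its own t value. The function names are hypothetical.

```python
# Two-sided 95% Student's t values by degrees of freedom (standard table entries).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447,
        7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228, 15: 2.131, 20: 2.086,
        30: 2.042, 60: 2.000, 120: 1.980}

def t_value(df):
    """Conservative lookup: use the largest tabulated df that does not exceed df."""
    return T_95[max(d for d in T_95 if d <= df)]

def increments_needed(s_s, x_bar, r_percent, n_max=10_000):
    """Smallest n with n >= t**2 * (100 * s_s / x_bar)**2 / R**2, t taken for n - 1 df."""
    ratio = (100.0 * s_s / x_bar) / r_percent   # both quantities in percent
    for n in range(2, n_max + 1):
        if n >= (t_value(n - 1) * ratio) ** 2:
            return n
    raise ValueError("more than n_max increments would be required")

# Hypothetical case: s_s is 1% of the mean and 1% is acceptable in the average.
print(increments_needed(s_s=1.0, x_bar=100.0, r_percent=1.0))
```

Starting from t = 1.96 alone would suggest four increments in this hypothetical case; allowing for the larger t values at small n raises the requirement noticeably.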

Two  other  distributions  that  may  be 
encountered,  particularly  in  biological 
materials,  should  be  mentioned.  One 
is the Poisson distribution, in which
the sought-for substance is distributed
randomly in the bulk material such
that $\sigma_s^2$ is approximately equal to $\mu$. In
this case

$$n = \frac{t^2}{R^2 \bar{x}} \qquad (4)$$

The other is the negative binomial
distribution, in which the sought-for
substance occurs in clumps or patches,
and $\sigma_s^2$ is larger than $\mu$. This pattern
often occurs in the spread of contamination
or contagion from single
sources, and is characterized by two
factors, the average, $\bar{x}$, and a term, $k$,
called the index of clumping. For this
system

$$n = \frac{t^2}{R^2}\left(\frac{1}{\bar{x}} + \frac{1}{k}\right) \qquad (5)$$

Here $k$ must be estimated, along with
$\bar{x}$, from preliminary measurements on
the system.
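
The two cases can be compared numerically with a short sketch. It takes the Poisson variance as equal to the mean and the negative binomial variance as the mean plus the square of the mean divided by k, and it assumes that R is entered as a fraction (0.10 for 10%) so that the units balance. The function names and the example values are hypothetical.

```python
import math

def n_poisson(t, r, x_bar):
    """Poisson case: n = t**2 / (R**2 * x_bar)."""
    return math.ceil(t ** 2 / (r ** 2 * x_bar))

def n_negative_binomial(t, r, x_bar, k):
    """Clumped case: n = (t**2 / R**2) * (1 / x_bar + 1 / k)."""
    return math.ceil((t ** 2 / r ** 2) * (1.0 / x_bar + 1.0 / k))

# Hypothetical counts: average of 50 per sample, 10% acceptable, t = 1.96.
print(n_poisson(1.96, 0.10, 50.0))
print(n_negative_binomial(1.96, 0.10, 50.0, k=5.0))
```

With the same mean, a small index of clumping dominates the result: the clumped material needs roughly ten times as many samples as the randomly distributed one in this hypothetical case.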

Sometimes,  what  is  wanted  is  not  an 
estimate  of  the  mean  but  instead  the 
two  outer  values  or  limits  that  contain 
nearly  all  of  the  population  values.  If 
we know the mean and standard deviation,
then the intervals $\mu \pm 2\sigma$ and
$\mu \pm 3\sigma$ contain 95% and 99.7%, respectively,
of all samples in the population.
Ordinarily, the standard deviation
$\sigma$ is not known but only its estimate
$s$, based on $n$ observations. In
this case we may calculate statistical
tolerance limits of the form $\bar{x} + Ks$
and $\bar{x} - Ks$, with the factor $K$ chosen
so that we may expect the limits to include
at least a fraction $P$ of the samples
with a stated degree of confidence.
Values for the factor $K$ (8) depend
upon the probability $\gamma$ of including
the proportion $P$ of the population,
and the sample size, $n$. Some values
of $K$ are given in Table I. For example,
when $\gamma = 0.95$ and $P = 0.95$,
then $K = 3.38$ when $n = 10$, and
$K = 37.67$ for duplicates ($n = 2$).
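
A brief sketch of the tolerance-limit calculation follows. The factor 3.38 is the value quoted above for ten observations; the ten data values and the function name `tolerance_limits` are hypothetical.

```python
import statistics

def tolerance_limits(values, k_factor):
    """Return (lower, upper) statistical tolerance limits: x_bar -/+ K * s."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    return x_bar - k_factor * s, x_bar + k_factor * s

K_N10 = 3.38   # factor quoted in the text for n = 10, both probabilities 0.95
# Hypothetical results from ten samples of a lot.
results = [10.0, 10.2, 9.8, 10.1, 9.9, 10.3, 9.7, 10.0, 10.1, 9.9]
low, high = tolerance_limits(results, K_N10)
print(f"tolerance limits: {low:.2f} to {high:.2f}")
```

The limits are much wider than two standard deviations on either side of the mean because they must allow for the uncertainty in both the estimated mean and the estimated standard deviation.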

Sampling  a  Segregated  (Strati- 
fied) Material.  Special  care  must  be 
taken  when  assessing  the  average 
amount  of  a  substance  distributed 
throughout  a  bulk  material  in  a  non- 
random  way.  Such  materials  are  said 
to  be  segregated.  Segregation  may  be 
found, for example, in ore bodies, in
different  production  batches  in  a 
plant,  or  in  samples  where  settling  is 
caused  by  differences  in  particle  size 
or  density. 

The  procedure  for  obtaining  a  valid 
sample  of  a  stratified  material  is  as 
follows  (9): 

•  Based  on  the  known  or  suspected 
pattern  of  segregation,  divide  the  ma- 
terial to  be  sampled  into  real  or  imagi- 
nary segments  (strata). 

•  Further  divide  the  major  strata  into 
real  or  imaginary  subsections  and  se- 
lect the  required  number  of  samples 
by  chance  (preferably  with  the  aid  of  a 
table  of  random  numbers). 

•  If  the  major  strata  are  not  equal  in 
size,  the  number  of  samples  taken 
from  each  stratum  should  be  propor- 
tional to  the  size  of  the  stratum. 
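
The steps above can be sketched in a few lines. The stratum names and sizes, the number of subsections, and the function names are all hypothetical, and a seeded pseudorandom generator stands in for the table of random numbers. Rounding the proportional shares may leave the total one or two samples off the target, which would need adjusting by hand.

```python
import random

def proportional_allocation(stratum_sizes, total_samples):
    """Number of samples per stratum, proportional to the size of the stratum."""
    total_size = sum(stratum_sizes.values())
    return {name: round(total_samples * size / total_size)
            for name, size in stratum_sizes.items()}

def random_subsections(n_subsections, n_samples, rng):
    """Choose distinct subsection numbers (1-based) by chance."""
    return sorted(rng.sample(range(1, n_subsections + 1), n_samples))

# Hypothetical lot divided into three strata of unequal size (arbitrary units).
strata = {"top": 10.0, "middle": 30.0, "bottom": 60.0}
plan = proportional_allocation(strata, total_samples=20)
rng = random.Random(1985)   # fixed seed so that the plan can be reproduced
picks = {name: random_subsections(50, count, rng) for name, count in plan.items()}
for name in strata:
    print(name, plan[name], picks[name])
```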

In  general,  it  is  better  to  use  strati- 
fied random  sampling  rather  than  un- 
restricted random  sampling,  provided 
the  number  of  strata  selected  is  not  so 
large  that  only  one  or  two  samples  can 
be  analyzed  from  each  stratum.  By 
keeping the number of strata sufficiently
small that several samples can
be taken from each, possible variations
within the parent population can
be detected and assessed without increasing
the standard deviation of the
sampling step.

Figure 4. Relation between minimum sample size and fraction of the richer particles
in a mixture of two types of spherical particles (diameter 0.1 mm and density
1) for a sampling standard deviation (R) of (a) 0.1% and (b) 1%. Richer particles
contain 10% of substance of interest, and leaner ones contain 0, 1, 5, or 9%
(after Reference 12, p 554)

Minimum  Number  of  Individual 
Increments.  When  a  bulk  material  is 
highly  segregated,  a  large  number  of 
samples  must  be  taken  from  different 
segments.  A  useful  guide  to  estimating 
the  number  of  samples  to  be  collected 
is  given  by  Visman  (10),  who  proposed 
that  the  variance  in  sample  composi- 
tion depends  on  the  degree  of  homoge- 
neity within  a  given  sample  increment 
and  the  degree  of  segregation  between 
sample  increments  according  to  the 
relation 


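$$s_s^2 = \frac{A}{W} + \frac{B}{n} \qquad (6)$$

where $s_s^2$ is the variance of the average
of $n$ samples using a total weight $W$ of
sample, and $A$ and $B$ are constants for
a given bulk material. $A$ is called a homogeneity
constant, and can be calculated
from Ingamells's sampling constant
and the average composition by

$$A = 10^{-4}\,\bar{x}^2 K_s \qquad (7)$$

(the factor $10^{-4}$ arises because $R$ in Equation 2 is expressed in percent).
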
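
A small sketch of how the Visman relation might be used in planning is given here. It reads the relation as variance = A/W + B/n with W the total weight, so that for n increments of equal weight w the variance becomes (A/w + B)/n. The constants, units, and function names are hypothetical.

```python
import math

def visman_variance(a, b, total_weight, n):
    """Variance of the average: A / W + B / n, with W the total weight of n increments."""
    return a / total_weight + b / n

def increments_for_variance(a, b, increment_weight, target_variance):
    """Smallest n of equal-weight increments giving at most the target variance.

    With W = n * w the relation becomes (A / w + B) / n.
    """
    return math.ceil((a / increment_weight + b) / target_variance)

# Hypothetical constants for a bulk material (consistent but arbitrary units).
A, B = 2.0, 0.5
n = increments_for_variance(A, B, increment_weight=4.0, target_variance=0.25)
print(n, visman_variance(A, B, total_weight=4.0 * n, n=n))
```

Taking larger increments reduces only the first term; when the segregation term B dominates, more increments are the only remedy.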
Sampling  Materials  in  Discrete 
Units.  If  the  lot  of  material  under 
study  occurs  in  discrete  units,  such  as 
truckloads,  drums,  bottles,  tank  cars, 
or  the  like,  the  variance  of  the  analyti- 
cal result  is  the  sum  of  three  contribu- 
tions: (1)  that  from  the  variance  be- 
tween units  in  the  lot,  (2)  that  from 
the  average  variance  of  sets  of  samples 
taken  from  within  one  unit,  and  (3) 
that  from  the  variance  of  the  analyti- 
cal operations.  The  contribution  from 
each  depends  upon  the  number  of 
units  in  the  lot  and  the  number  of 
samples  taken  according  to  the  fol- 
lowing relation  (9): 

$$\sigma_{\bar{x}}^2 = \frac{\sigma_b^2\,(N - n_b)}{n_b N} + \frac{\sigma_w^2}{n_b n_w} + \frac{\sigma_t^2}{n_t} \qquad (8)$$

Glossary

Bulk  sampling — sampling  of  a  material  that  does  not  consist  of  discrete,  identifiable, 
constant  units,  but  rather  of  arbitrary,  irregular  units. 

Composite — a  sample  composed  of  two  or  more  Increments. 

Gross  sample  (also  called  bulk  sample,  lot  sample) — one  or  more  increments  of  material 
taken  from  a  larger  quantity  (lot)  of  material  for  assay  or  record  purposes. 

Homogeneity — the  degree  to  which  a  property  or  substance  is  randomly  distributed 
throughout  a  material.  Homogeneity  depends  on  the  size  of  the  units  under  consider- 
ation. Thus  a  mixture  of  two  minerals  may  be  inhomogeneous  at  the  molecular  or 
atomic  level,  but  homogeneous  at  the  particulate  level. 

Increment — an  individual  portion  of  material  collected  by  a  single  operation  of  a  sampling 
device,  from  parts  of  a  lot  separated  in  time  or  space.  Increments  may  be  either  tested 
individually  or  combined  (composited)  and  tested  as  a  unit. 

Individuals — conceivable  constituent  parts  of  the  population. 

Laboratory  sample — a  sample,  intended  for  testing  or  analysis,  prepared  from  a  gross 
sample  or  otherwise  obtained.  The  laboratory  sample  must  retain  the  composition 
of the gross sample. Often reduction in particle size is necessary in the course of
reducing the quantity.

Lot — a  quantity  of  bulk  material  of  similar  composition  whose  properties  are  under 
study.

Population — a  generic  term  denoting  any  finite  or  infinite  collection  of  individual  things, 
objects,  or  events  in  the  broadest  concept;  an  aggregate  determined  by  some  property 
that  distinguishes  things  that  do  and  do  not  belong. 

Reduction — the  process  of  preparing  one  or  more  subsamples  from  a  sample. 

Sample — a  portion  of  a  population  or  lot.  It  may  consist  of  an  individual  or  groups  of  in- 
dividuals. 

Segment — a  specifically  demarked  portion  of  a  lot,  either  actual  or  hypothetical. 

Strata — segments  of  a  lot  that  may  vary  with  respect  to  the  property  under  study. 

Subsample — a  portion  taken  from  a  sample.  A  laboratory  sample  may  be  a  subsample 
of  a  gross  sample;  similarly,  a  test  portion  may  be  a  subsample  of  a  laboratory 
sample. 

Test  portion  (also  called  specimen,  test  specimen,  test  unit,  aliquot) — That  quantity  of 
a  material  of  proper  size  for  measurement  of  the  property  of  interest.  Test  portions 
may  be  taken  from  the  gross  sample  directly,  but  often  preliminary  operations,  such 
as  mixing  or  further  reduction  in  particle  size,  are  necessary. 


where, in Equation 8,

$\sigma_{\bar{x}}^2$ = variance of the mean,
$\sigma_b^2$ = variance of the units in the lot,
$\sigma_w^2$ = average variance of the samples taken from a segment,
$\sigma_t^2$ = variance of the analytical operations,
$N$ = number of units in the lot,
$n_b$ = number of randomly selected units sampled,
$n_w$ = number of randomly drawn samples from each unit selected for sampling, and
$n_t$ = total number of analyses, including replicates, run on all samples.

If  stratification  is  known  to  be  ab- 
sent, then  much  measurement  time 


and  effort  can  be  saved  by  combining 
all  the  samples  and  mixing  thoroughly 
to  produce  a  composite  sample  for 
analysis.  Equation  8  is  applicable  to 
this  situation  also.  If  the  units  vary 
significantly  in  weight  or  volume,  the 
results  for  those  units  should  be 
weighted  accordingly. 

For homogeneous materials $\sigma_w^2$ is
zero, and the second term on the
right-hand side of Equation 8 drops
out. This is the case with many liquids
or gases. Also, if all units are sampled,
then $n_b = N$ and the first term on the
right-hand side of Equation 8 also
drops out.
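
The behavior of the three contributions can be checked with a short sketch. It reads Equation 8 as the sum of a between-unit term, variance between units times (N − n_b)/(n_b N), a within-unit term, variance within units divided by (n_b n_w), and an analytical term, analytical variance divided by n_t. The function name and all numerical values are hypothetical.

```python
def variance_of_mean(var_b, var_w, var_t, n_lot, n_b, n_w, n_t):
    """Variance of the mean as the sum of the three contributions in Equation 8."""
    between = var_b * (n_lot - n_b) / (n_b * n_lot)   # vanishes when n_b == n_lot
    within = var_w / (n_b * n_w)                      # vanishes for homogeneous units
    analytical = var_t / n_t
    return between + within + analytical

# Hypothetical lot of 100 drums; 10 drums sampled, 2 samples each, 20 analyses.
print(variance_of_mean(4.0, 2.0, 1.0, n_lot=100, n_b=10, n_w=2, n_t=20))
# Every unit sampled and homogeneous contents: only the analytical term remains.
print(variance_of_mean(4.0, 0.0, 1.0, n_lot=10, n_b=10, n_w=2, n_t=20))
```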

Particle  Size  in  Sampling 
Particulate  Mixtures 

Random  sampling  error  may  occur 
even  in  well-mixed  particulate  mix- 


tures if  the  particles  differ  appreciably 
in  composition  and  the  test  portion 
contains  too  few  of  them.  The  problem 
is  particularly  important  in  trace  anal- 
ysis, where  sampling  standard  devia- 
tions may  quickly  become  unaccepta- 
bly  large.  The  sampling  constant  di- 
agram of  Ingamells  and  the  Visman 
expression  are  useful  aids  for  estimat- 
ing sample  size  when  preliminary  in- 
formation is  available.  Another  ap- 
proach that  can  often  provide  insight 
is  to  consider  the  bulk  material  as  a 
two-component  particulate  mixture, 
with each component containing a different
percentage of the analyte of interest
(11). To determine the weight
of sample required to hold the sampling
standard deviation to a preselected
level, the first step is to determine
the number of particles $n$. The
value of $n$ may be calculated from the
relation

$$n = \left[\frac{d_1 d_2}{d^2}\right]^2 \left[\frac{100\,(P_1 - P_2)}{R\,P}\right]^2 p(1 - p) \qquad (9)$$

where $d_1$ and $d_2$ are the densities of
the two kinds of particles, $d$ is the
density of the sample, $P_1$ and $P_2$ are
the percentage compositions of the
component of interest in the two kinds
of particles, $P$ is the overall average
composition in percent of the
component of interest in the sample,
$R$ is the percent relative standard
deviation (sampling error) of the
sampling operation, and $p$ and $1 - p$
are the fractions of the two kinds of
particles in the bulk material. With
knowledge of the density, particle
diameter, and $n$, the weight of sample
required for a given level of sampling
uncertainty can be obtained through
the expression, weight = $(4/3)\pi r^3 d n$
(assuming spherical particles).

Figure  4  shows  the  relation  between 
the  minimum  weight  of  sample  that 
should  be  taken  and  the  composition 
of  mixtures  containing  two  kinds  of 
particles,  one  containing  10%  of  the 
sought-for  substance  and  the  other  9, 
5,  1,  or  0%.  A  density  of  1,  applicable 
in  the  case  of  many  biological  materi- 
als, is  used,  along  with  a  particle  diam- 
eter of  0.1  mm.  If  half  the  particles  in 
a  mixture  contain  10%  and  the  other 
half  9%  of  the  substance  of  interest, 
then  a  sample  of  0.0015  g  is  required  if 
the  sampling  standard  deviation  is  to 
be  held  to  a  part  per  thousand.  If  the 
second  half  contains  5%,  a  sample  of 
0.06  g  is  necessary;  if  1%,  0.35  g  would 
be  needed.  In  such  mixtures  it  is  the 
relative  difference  in  composition  that 
is  important.  The  same  sample 
weights  would  be  required  if  the  com- 
positions were  100%  and  90,  50,  or 
10%,  or  if  they  were  0.1%  and  0.09, 
0.05,  or  0.01%.  The  same  curves  can  be 
used  for  any  relative  composition  by 
substitution of x for 10%, and 0.1 x,
0.5 x, and 0.9 x for the curves corresponding
to 1, 5, and 9% in Figure 4. If
a  standard  deviation  of  1%  is  accept- 
able, the  samples  can  be  100  times 
smaller  than  for  0.1%. 
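
The sample weights quoted above can be reproduced approximately with a short sketch of the two-component calculation: the number of particles is the product of a density term, a composition term, and p(1 − p), and the weight follows from the volume of a sphere. The function names are hypothetical, and the average composition is taken as the simple number-weighted mean, which is an assumption that holds only when both kinds of particles have the same density and size.

```python
import math

def particles_needed(d1, d2, d, p1, p2, p_avg, r_percent, frac):
    """Number of particles: (d1*d2/d**2)**2 * (100*(P1-P2)/(R*P))**2 * p*(1-p)."""
    density_term = (d1 * d2 / d ** 2) ** 2
    composition_term = (100.0 * (p1 - p2) / (r_percent * p_avg)) ** 2
    return density_term * composition_term * frac * (1.0 - frac)

def sample_weight_g(n_particles, diameter_cm, density):
    """Weight of n spherical particles: (4/3) * pi * r**3 * d * n."""
    radius = diameter_cm / 2.0
    return (4.0 / 3.0) * math.pi * radius ** 3 * density * n_particles

def weight_for_mixture(p_rich, p_lean, r_percent, frac=0.5,
                       diameter_cm=0.01, density=1.0):
    p_avg = frac * p_rich + (1.0 - frac) * p_lean   # valid for equal densities only
    n = particles_needed(density, density, density, p_rich, p_lean,
                         p_avg, r_percent, frac)
    return sample_weight_g(n, diameter_cm, density)

for lean in (9.0, 5.0, 1.0):
    print(f"10% and {lean:.0f}%: {weight_for_mixture(10.0, lean, 0.1):.4f} g")
```

Changing `r_percent` from 0.1 to 1 reduces each weight a hundredfold, as stated in the text.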

An  important  point  illustrated  by 
the  figure  is  that  if  the  fraction  of 
richer  particles  is  small,  and  the  leaner 
ones  contain  little  or  none  of  the  sub- 
stance of  interest,  large  test  portions 
are  required.  If  a  sample  of  gold  ore 
containing  0.01%  gold  when  ground  to 
140  mesh  (0.1  mm  in  diameter)  con- 
sists, say,  of  only  particles  of  gangue 
and  of  pure  gold,  test  portions  of  30  g 
would  be  required  to  hold  the  sam- 
pling standard  deviation  to  1%.  (An 
ore  density  of  3  is  assumed.) 

Concluding  Comments 

Sampling  is  not  simple.  It  is  most 
important  in  the  worst  situations.  If 
the quantities $\bar{x}$, $s$, $K_s$, $A$, and $B$ are
known  exactly,  then  calculation  of  the 
statistical  sampling  uncertainty  is 
easy,  and  the  number  and  size  of  the 
samples  that  should  be  collected  to 
provide  a  given  precision  can  be  readi- 
ly determined.  But  if,  as  is  more  usual, 
these  quantities  are  known  only  ap- 
proximately, or  perhaps  not  at  all, 
then  preliminary  samples  and  mea- 
surements must  be  taken  and  on  the 
basis  of  the  results  more  precise  sam- 
pling procedures  developed.  These 
procedures  will  ultimately  yield  a 
sampling  plan  that  optimizes  the  qual- 
ity of  the  results  while  holding  down 
time  and  costs. 

Sampling  theory  cannot  replace  ex- 
perience and  common  sense.  Used  in 
concert  with  these  qualities,  however, 
it  can  yield  the  most  information 
about  the  population  being  sampled 
with  the  least  cost  and  effort.  All  ana- 
lytical chemists  should  know  enough 
sampling  theory  to  be  able  to  ask  in- 
telligent questions  about  the  samples 
provided,  to  take  subsamples  without 
introducing  additional  uncertainty  in 
the  results  and,  if  necessary,  to  plan 
and  perform  uncomplicated  sampling 
operations.  It  is  the  capability  of  un- 
derstanding and  executing  all  phases 
of analysis that ultimately characterizes
the true analytical chemist, even
though  he  or  she  may  possess  special 
expertise  in  a  particular  separation  or 
measurement  technique. 

ABSTRACT: The role of reference materials in monitoring the chemical
measurement  process  is  considered.  Requirements  for  reliable  reference 
materials  are  reviewed.  The  use  of  reference  material  data  in  estimating 
the  uncertainties  of  the  results  of  measurements  on  test  samples  is 
discussed. 

The increasing requirements for accuracy in chemical analysis
and  the  necessity  to  interrelate  and  combine  data  sets  from  several 
laboratories  or  from  the  same  laboratory  over  intervals  of  time  have 
created  a  demand  for  well-characterized  reference  materials  (RMs) 
for  quality  assurance  purposes.  When  properly  used,  such  materials 
can  provide  a  high  degree  of  confidence  in  analytical  data. 

On  the  basis  of  inquiries  received  and  discussions  with  various 
analysts,  it  is  clear  that  RMs  are  not  fully  used  in  some  situations  and 
are  poorly  used  in  others.  It  is  the  purpose  of  this  paper  to  clarify  the 
role  of  RMs  in  the  chemical  measurement  process  and  to  suggest 
ways  in  which  they  may  be  used  to  the  best  advantage. 

Role  of  Reference  Materials 

An RM is any substance that may be measured simultaneously or sequentially
in a measurement process to provide information about
the process or the measurements arising therefrom. RMs may be internally
developed to monitor a specific measurement process, or
they may be provided by an external source. Externally developed
reference materials (ERMs) are usually certified by some organization
and frequently called certified reference materials (CRMs) [1].
The National Bureau of Standards (NBS) Standard Reference Materials
(SRMs) are a special class of CRMs that have been carefully
analyzed (sometimes with the aid of cooperators) and certified by
NBS [2-4]. While the RMs described above may differ in status,
each can provide useful information to the analyst for the assessment
of data quality.

Presented at the Symposium on Reference Materials and Their Use in
Nuclear Fuel Cycle, sponsored by ASTM Committee C-26 on Nuclear Fuel
Cycle on 9 Aug. 1982, Knoxville, TN.

¹Coordinator for quality assurance and voluntary standardization,
Center for Analytical Chemistry, National Bureau of Standards, Washington,
DC 20234.

The only rational basis for use of an RM is as a monitor of a measurement
system that is in a state of statistical control [5]. This
means  that  a  valid  measurement  principle  has  been  identified  and 
put  into  practice  using  quality  control  procedures  that  assure  a  req- 
uisite degree  of  reproducibility.  Indeed,  such  a  measurement  pro- 
cess should,  conceptually,  be  capable  of  producing  an  infinite  pop- 
ulation of  measurements,  some  of  which  at  any  moment  may  be 
considered  as  a  sample.  The  measurements  of  the  RM  may  then  be 
considered  as  random  samplings  of  the  output  of  the  measurement 
system,  which  would  permit  their  interpretation  for  evaluation  of 
the  measurement  process. 

Requirements  for  RMs 

RMs  of  any  type  must  be  appropriate  in  matrix  and  composition 
and  of  stable  composition  over  the  intended  period  of  use.  They 
must be sufficiently uniform in composition when subsampled (homogeneous)
and available in sufficient quantity to be useful over a
reasonable  period  of  time. 

CRMs  have  the  further  requirement  that  they  must  be  issued 
with  a  certificate  in  which  their  measured  parameters  and  assigned 
uncertainties  are  fully  documented  [6].  Internal  reference  materials 
(IRMs)  must  have  an  equal  degree  of  reliability  with  respect  to  sta- 
bility and  homogeneity.  The  requirement  for  accuracy  of  the  as- 
signed values  of  specified  parameters  of  the  latter  will  depend  upon 
the  end  use  of  the  materials,  but  this  may  be  of  lesser  importance 
than  homogeneity  in  some  cases. 

Laboratories  are  well  advised  to  upgrade  the  accuracy  of  their 
IRMs to the highest extent possible. Intercomparisons with high
quality CRMs, such as SRMs, can be used to accomplish this purpose.

Interpretation  of  RM  Data 

The primary function of IRMs (often called control samples) is to
evaluate  the  attainment  of  statistical  control  of  the  measurement 
system.  As  long  as  such  samples  are  stable  and  homogeneous  in 
composition, the precision observed in the analysis of the IRM may
be  inferred  as  the  precision  of  the  measurement  system  for  that  par- 
ticular measurement.  Accordingly,  it  is  clear  that  stability  and  uni- 
formity are  the  prime  requirements  of  such  materials  and  that  accu- 
rate knowledge  of  compositions  is  only  a  secondary  consideration. 
While  such  information  is  highly  desirable,  the  proof  of  absence  of 
bias  using  such  samples  may  be  difficult  to  establish  based  on  inter- 
nally generated  evidence  alone. 

ERMs  and  CRMs  in  particular  are  best  used  to  demonstrate  ac- 
curacy, that  is,  the  freedom  from  bias,  of  measurement  systems  that 
are  demonstrated  to  be  in  a  state  of  statistical  control.  Because  the 
CRM  may  be  costly  or  available  in  limited  amounts  or  both,  this  lat- 
ter demonstration  is  often  best  left  to  the  use  of  IRMs.  Because  of 
their  high  quality  and  the  care  used  in  their  certification,  SRMs 
often stand at the top of the RM hierarchy and, hence, are especially
useful to evaluate the accuracy of a measurement process. Needless
to say, SRMs should be used in carefully designed test sequences
together  with  IRMs  if  maximum  information  is  to  be  provided. 

Figure  1  contains  a  typical  sequence  in  which  IRMs  and  SRMs 
may  be  measured  together  with  the  test  samples  to  monitor  a  mea- 
surement process.  Figure  1  assumes  that  control  charts  are  main- 
tained [7],  the  kinds  of  which  are  indicated  in  the  "Notes."  The  con- 
trol limits  of  the  charts  are  determined  from  the  results  of  previous 
measurements  of  the  IRM.  Points  on  the  control  charts  are  plotted 
immediately  after  the  data  are  obtained,  and  they  must  fall  within 
the  control  limits,  at  the  decision  points,  in  order  to  continue  the 
measurements  sequence.  Otherwise  any  measurements  obtained 
since  the  last  time  the  system  was  known  to  be  in  control  are  suspect 
and discarded or held in abeyance. Furthermore, the system must
be demonstrated to have regained control before data may again be
accepted.

FIG. 1. Quality assessment using IRM samples.
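
The decision rule described above might be sketched as follows. The text does not state how wide the control limits should be; limits at three standard deviations on either side of the mean of previous IRM results are assumed here as a conventional choice, and the data and function names are hypothetical.

```python
import statistics

def control_limits(history, width=3.0):
    """Limits at mean -/+ width * s from previous IRM results (width is an assumption)."""
    center = statistics.mean(history)
    s = statistics.stdev(history)
    return center - width * s, center + width * s

def in_control(value, limits):
    """True if a newly plotted IRM result falls within the control limits."""
    low, high = limits
    return low <= value <= high

# Hypothetical previous measurements of an internal reference material.
previous = [5.02, 4.98, 5.01, 4.99, 5.00, 5.03, 4.97, 5.00]
limits = control_limits(previous)
for result in (5.04, 5.10):
    status = "continue" if in_control(result, limits) else "stop and hold data"
    print(f"IRM result {result:.2f}: {status}")
```

A result outside the limits would call into question every test-sample measurement made since the last in-control point, as the text explains.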

Figure  2  is  similar  to  Fig.  1  but  uses  duplicate  or  split  samples  to 
monitor  statistical  control.  Control  charts  based  on  the  differences 
between  the  results  obtained  for  the  duplicate/split  samples  are  the 
basis  for  monitoring  statistical  control.  Such  charts  are  described  in 
several  publications  [7].  The  rationale  to  be  followed  at  the  deci- 
sion points  is  essentially  the  same  as  described  in  the  discussion  of 
Fig.  1. 
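
A chart based on differences between duplicates can be sketched as follows. The example assumes a conventional range chart for subgroups of two, for which the upper control limit is 3.267 times the mean range; the function name and the data are hypothetical and serve only to illustrate the idea.

```python
D4_DUPLICATES = 3.267  # standard range-chart factor for subgroups of size 2

def range_chart_limit(previous_pairs):
    """Return (mean range, upper control limit) from earlier duplicate pairs."""
    ranges = [abs(a - b) for a, b in previous_pairs]
    r_bar = sum(ranges) / len(ranges)
    return r_bar, D4_DUPLICATES * r_bar

# Hypothetical duplicate results obtained while the system was in control.
pairs = [(5.1, 5.3), (4.9, 5.0), (5.2, 5.0), (5.0, 5.1)]
r_bar, ucl = range_chart_limit(pairs)

# A new duplicate pair is judged by the size of its difference.
new_pair = (5.0, 5.8)
new_range = abs(new_pair[0] - new_pair[1])
print("in control" if new_range <= ucl else "out of control")
```

Such a chart monitors precision only; it gives no information about bias, which is why the sequence also includes RMs of known composition.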

No  matter  what  kinds  of  RMs  are  used,  their  ability  to  monitor  the 
measurement process and especially the measurements in progress
must  be  demonstrated,  and  this  is  often  a  matter  of  inference. 
Though  statistical  control  of  the  measurement  system  and  freedom 
from  bias  may  be  readily  demonstrated  for  the  case  of  measurement 
of RMs, the performance of the system on measurements of other
test  samples  is  the  matter  of  concern.  To  the  extent  that  the  RMs 
simulate  the  test  samples,  the  inference  drawn  from  measurement  of 
the  former  may  be  transferred  to  the  latter.  Conversely,  the  con- 
fidence may  diminish  as  the  degree  of  simulation  is  decreased.  In 
every  case,  the  experience  and  professional  judgment  of  the  analyst 
must  be  used  to  infer  how  well  an  RM  monitors  the  actual  measure- 
ment process. 

RMs may also be used to evaluate the suitability of a proposed
method for a special purpose or to determine the performance
characteristics of methods under development. Such use of RMs has
much the same limitations as their use in monitoring a process.
Because RMs are stable test samples, their analysis can provide data
on the precision of a method of measurement. How well this may be
transferred to a practical measurement situation, and how well
potential biases are evaluated, is again a matter of judgment, which
may need to be supported by additional information.

Conclusion 

Laboratories  should  refrain  from  reporting  data  unless  they  are  in 
a  position  to  assign  uncertainties  to  the  reported  values  [<*].  Such  an 
assignment  requires  the  attainment  of  statistical  control  of  the  mea- 
surement system  and  estimation  of  the  bounds  of  systematic  error. 
The  analysis  of  reliable  RMs  in  a  planned  measurement  sequence 
can  provide  the  basis  of  estimating  both  the  random  and  systematic 
components  of  the  measurement  uncertainty.  However,  the  assign- 
ment of  such  uncertainties  to  the  test  results  must  be  done  with  due 
consideration  of  any  matrix  differences  between  the  RMs  and  the 
test  samples.  The  analysis  of  high  quality  ERMs,  such  as  SRMs,  to- 
gether with  the  laboratory's  IRMs  (control  samples)  is  the  best  ap- 
proach to  monitoring  a  chemical  measurement  system  for  quality  as- 
surance of  the  data  output. 

Validation of Analytical Methods


Validation  of  analytical  methods  is 
a  subject  of  considerable  interest. 
Documents  such  as  the  "ACS  Guide- 
lines for  Data  Acquisition  and  Data 
Quality  Evaluation"  (1)  recommend 
the  use  of  validated  methods.  The 
promulgation  of  federal  environmen- 
tal regulations  requires  the  inclusion 
of  validated  reference  methods.  Stan- 
dards-writing organizations  spend 
considerable  time  in  collaborative 
testing  of  methods  they  prepare,  vali- 
dating them  in  typical  applications 
and  determining  their  performance 
characteristics.  Nevertheless,  ques- 
tions about  the  appropriateness  of 
methods  and  the  validity  of  their  use 
in  specific  situations  often  arise.  Some 
of these questions may be due to
differences in understanding both what a
method  really  is  and  what  the  signifi- 
cance of  the  validation  process  is.  This 
paper  attempts  to  clarify  the  nomen- 
clature of  analytical  methodology  and 
to  define  the  process  of  validating 
methods  for  use  in  specific  situations. 

Hierarchy  of  Methodology 

The  hierarchy  of  methodology,  pro- 
ceeding from  the  general  to  the  specif- 
ic, may  be  considered  as  follows: 
technique → method → procedure → protocol.

A  technique  is  a  scientific  principle 
that  has  been  found  to  be  useful  for 
providing  compositional  information; 
spectrophotometry  is  an  example.  An- 
alytical chemists  historically  have  in- 
vestigated new  measurement  tech- 
niques for  their  ability  to  provide 
novel  measurement  capability,  or  to 
replace or supplement existing
methodology. As a result of innovative
applications, analysts can now analyze
for myriad substances in exceedingly
complex mixtures at ever lower trace
levels, with precision and accuracy
undreamed of only a few years ago (2).

This REPORT is based on a talk given at the
184th ACS National Meeting, Sept. 12-17, 1982,

A  method  is  a  distinct  adaptation 
of  a  technique  for  a  selected  measure- 
ment purpose.  The  pararosaniline 
method  for  measurement  of  sulfur 
dioxide  is  an  example.  It  involves  mea- 
suring the  intensity  of  a  specific  dye, 
the  color  of  which  is  "bleached"  by  the 
gas.  Several  procedures  for  carrying 
out  this  method  may  be  found  in  the 
literature. Modern methodology is
complex and may incorporate several
measurement techniques; a method
may thus be interdisciplinary.

Hierarchy of Analytical Methodology

            Definition                  Example

Technique   Scientific principle        Spectrophotometry
            useful for providing
            compositional information

Method      Distinct adaptation of a    Pararosaniline method for
            technique for a selected    measurement of sulfur
            measurement purpose         dioxide

Procedure   Written directions          ASTM D2914, Standard Test
            necessary to use a          Method for the Sulfur
            method                      Dioxide Content of the
                                        Atmosphere (West-Gaeke
                                        Method)

Protocol    Set of definitive           EPA Reference Method for
            directions that must be     the Determination of
            followed, without           Sulfur Dioxide in the
            exception, if the           Atmosphere (Pararosaniline
            analytical results are      Method)
            to be accepted for a
            given purpose

A procedure consists of the written
directions necessary to utilize a
method. The "standard methods"
developed by ASTM and AOAC are, in
reality, standardized procedures. ASTM
D2914, Standard Test Method for
the Sulfur Dioxide Content of the
Atmosphere (West-Gaeke Method), is
an example (3). While a precise
description is the aim, it is difficult, if
not impossible, to describe every
detail of every operation in a procedure.
Accordingly,  some  level  of  sophistica- 
tion is  presumed  for  the  user  of  every 
published  procedure;  if  very  sophisti- 
cated users  are  contemplated,  only  a 
minimum  of  detail  will  be  provided 
and  vice  versa.  However,  it  should  be 
noted  that  any  omission  in  the  de- 
scription of  critical  steps  is  a  potential 
source  of  variance  or  bias,  even  in  the 
hands  of  knowledgeable  analysts.  Be- 
cause of  the  flexibility  intentionally  or 
unintentionally  provided  to  the  ana- 
lyst, or  because  of  differences  in  inter- 
pretation, it  is  fair  to  say  that  minor- 
to-major  differences  of  application 
occur  in  the  use  of  even  the  most  pre- 
cisely defined  procedures.  Such  differ- 
ences often  account  for  the  interlabo- 
ratory  variability  observed  in  many 
collaborative  tests.  Further,  at  some 
point  of  departure  from  a  published 
procedure,  a  new  method  results  that 
may  need  its  own  validation. 

The  term  protocol  is  the  most  spe- 
cific name  for  a  method.  A  protocol  is 
a  set  of  definitive  directions  that  must 
be  followed,  without  exception,  if  the 
analytical  results  are  to  be  accepted 
for  a  given  purpose.  Protocols  may 
consist  of  existing  methods  or  proce- 
dures, modifications  of  such,  or  they 
may be developed especially for
specific purposes. Typically, they are
prescribed by an official body for use in a
given  situation  such  as  a  regulatory 
process.  The  EPA  Reference  Method 
for  the  Determination  of  Sulfur  Diox- 
ide in  the  Atmosphere  (Pararosaniline 
Method)  is  an  example  of  a  protocol 
(4).  The  test  method  specified  as  part 
of  a  contractual  arrangement  for  the 
acceptance  of  data  or  a  product  or  ma- 
terial is  another  example  of  a  protocol, 
although  it  may  not  be  called  that  in 
the  contract. 

A plethora of methods, procedures,
and protocols based on the same
measurement principle can arise for a
given analytical determination.
Usually, they are worded differently, and
they may contain subtle or major
differences in technical details. The
extent to which each needs to be
individually validated is a matter of
professional judgment. It is evident that
some validation tests could be merely
a matter of experimentally testing the
clarity of the written word.

Figure 1. Basic concept of the validation process

Goals  for  Validation 

Validation  is  the  process  of  deter- 
mining the  suitability  of  methodology 
for  providing  useful  analytical  data. 
This  is  a  value  judgment  in  which  the 
performance parameters of the
method are compared with the
requirements for the analytical data, as
illustrated in Figure 1. Obviously a method
that  is  valid  in  one  situation  could  be 
invalid  in  another.  Accordingly,  the 
establishment  of  firm  requirements 
for  the  data  is  a  prerequisite  for  meth- 
od selection  and  validation.  When 
data  requirements  are  ill-considered, 
analytical  measurement  can  be  unnec- 
essarily expensive  if  the  method  cho- 
sen is  more  accurate  than  required,  in- 
adequate if  the  method  is  less  accurate 
than  required,  or  utterly  futile  if  the 
accuracy  of  the  method  is  unknown. 

Fortunately,  typical  and  even  stan- 
dard measurement  problems  often 
exist.  Examples  include  a  wide  variety 
of clinical analyses, environmental
determinations, and recurring
measurements for the characterization of
industrial products. The kinds of
samples for which methods have been
validated should be clearly described,
and  users  should  be  aware  of  the  need 
to  demonstrate  their  own  abilities  to 
use  the  method  in  their  own  laborato- 
ries. 

Statements  of  precision  and  accura- 
cy are  often  a  result  of  a  validation 
process,  especially  in  the  case  of  a  col- 
laborative test  exercise.  Such  state- 
ments are  often  misinterpreted;  they 
merely  describe  the  results  of  the  ex- 
ercise and  are,  at  best,  estimates  of 
typical  performance  expectations  for 
the  method.  They  should  not  be  con- 
strued to  be  performance  parameters 
nor  should  they  be  used  to  estimate 
the  uncertainty  of  any  future  data  ob- 
tained by  using  the  method.  However, 
information  on  precision  and  accuracy 
should be obtained to the extent
possible since it provides a quantitative
basis for judging general performance
capability. 

Other information useful for
characterizing methodology or for judging
its  suitability  for  a  given  use  includes: 
sensitivity  to  interferences,  limits  of 
detection,  and  useful  range  of  mea- 
surement. The  specific  details  for 
evaluating  methodology  in  these  re- 
spects are  beyond  the  scope  of  the 
present  paper.  Ordinarily,  such  infor- 
mation is  best  obtained  as  a  result  of 
applied  research  during  the  method 
development  stage.  Because  the  limit 
of  detection  is  closely  related  to  the 
attainable  precision  at  the  lower  limit 
of  measurement,  both  the  limit  of  de- 
tection and  the  lowest  concentration 
range  measurable  (often  called  limit  of 
quantitation)  should  be  evaluated,  as 
pertinent, in every laboratory (1, 5).

Validation  Process 

The  validation  process  verifies  that 
the  methodology  is  based  on  sound 
technical  principles  and  that  it  has 
been  reduced  to  practice  for  practical 
measurement  purposes.  Both  the  need 
to  validate  methodology  and  the  pro- 
cedure to  be  followed  are  matters  for 
professional  judgment.  The  validation 
can  be  either  general  or  specific. 

General  Validation.  Validation  of 
measurement  techniques  depends  on 
the  elucidation  of  the  scientific  princi- 
ples upon  which  they  are  based.  Such 
validation  results  from  the  research  of 
the  scientific  community,  and  its 
soundness  is  evaluated  by  peer  review. 
Better  understanding  of  measurement 
principles  can  extend  their  scope  and 
improve  the  quality  of  their  use.  To 
confirm  the  above  statement,  one 
need  only  think  about  the  varied  re- 
search that  has  contributed  to  the  un- 
derstanding of  the  principles  of  gas 
chromatography  and  that  has  led  to 
development  of  its  status  as  a  prime 
measurement  technique. 

Methods arise as the result of
applied research, typically by
individuals, that often involves both a
comprehensive understanding of
measurement techniques and a high degree of
ingenuity  and  innovation  in  their  ap- 
plication. Testing  of  the  methods  in 
typical  practical  situations  plays  a  key 
role  in  both  the  development  process 
and  in  validation.  While  ordinarily 
limited  in  scope,  validation  at  the  re- 
search stage  can  be  comprehensive 
and  can  apply  to  a  wide  variety  of  end 
uses. 

Procedures  are  developed  for  the 
end  use  of  methods  in  practical  ana- 
lytical situations.  The  user  laboratory 
ordinarily  needs  more  experimental 
details  than  are  contained  in  a  pub- 
lished research  report  of  a  method  to 
use  it  in  practical  measurements.  Fre- 
quently, as  a  method  gains  widespread 
use,  procedures  evolve  that  the  users 
may  decide  need  to  be  standardized. 
This  is  often  done  by  consensus  in  a 
standards  organization  forum.  During 
this  process,  the  resulting  standard 
procedure  is  examined  both  technical- 
ly and  editorially.  A  thorough  review 
process  includes  collaborative  testing 
in  which  typical  stable  test  materials 
are  analyzed  to  verify  the  procedure's 
usefulness  and  to  identify  both  techni- 
cal and  editorial  weaknesses.  The  pro- 
cess is  illustrated  in  Figure  2.  If  the 
composition  of  the  reference  samples 
is  known,  precision  and  bias,  both 
intra-  and  interlaboratory,  can  be 
evaluated;  otherwise,  only  precision 
can  be  evaluated.  If  a  method  of 
known  accuracy  is  available,  the  col- 
laborative test  may  consist  of  its  com- 
parison with  the  candidate  method,  in 
which  case  both  precision  and  bias  can 
be  evaluated.  The  performance  pa- 
rameters of  the  procedure  so  evalu- 
ated are  for  the  conditions  of  the  col- 
laborative test  that  are  considered 
typical.  Any  extension  of  them  to 
other  kinds  of  samples  is  by  inference 
only,  and  may  need  to  be  justified.  Al- 
though it  can  be  time-consuming,  the 
development  of  a  standard  method  is 
one  of  the  best  ways  to  validate  a  pro- 
cedure because  of  the  breadth  of  ex- 
amination that  is  involved. 
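
The quantities that a collaborative test can yield may be illustrated with a short sketch. It is a simplified summary, not a full interlaboratory analysis of variance: it pools the within-laboratory variances, reports the scatter of the laboratory means, and reports a bias only when the composition of the reference sample is known. The function name, dictionary keys, and data are hypothetical.

```python
from statistics import mean, stdev, variance

def collaborative_summary(results_by_lab, known_value=None):
    """Summarize replicate results reported by several laboratories."""
    lab_means = [mean(vals) for vals in results_by_lab.values()]
    # Pooled within-laboratory variance, weighted by degrees of freedom.
    df = sum(len(vals) - 1 for vals in results_by_lab.values())
    pooled_var = sum((len(vals) - 1) * variance(vals)
                     for vals in results_by_lab.values()) / df
    summary = {
        "within_lab_sd": pooled_var ** 0.5,
        # Scatter of laboratory means; it includes a within-lab contribution.
        "sd_of_lab_means": stdev(lab_means),
    }
    if known_value is not None:
        # Bias can be estimated only when the true composition is known.
        summary["bias"] = mean(lab_means) - known_value
    return summary

# Hypothetical triplicate results from three laboratories.
results = {
    "A": [10.2, 10.0, 10.1],
    "B": [10.5, 10.6, 10.4],
    "C": [9.9, 10.0, 10.1],
}
with_reference = collaborative_summary(results, known_value=10.0)
without_reference = collaborative_summary(results)
print(with_reference)
print(without_reference)
```

Without a known value the summary contains precision estimates only, which mirrors the distinction drawn in the text.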

A  protocol  is  prescribed  by  fiat  of  an 
organization  requiring  a  specific  kind 
of  measurement.  Presumably  it  results 
from  an  intelligent  decision  based  on 
the organization's validation process
or  that  of  others.  This  may  consist  of 
an  extensive  collaborative  test  or  pub- 
lication of  a  proposed  protocol  for 
public  comment.  Unfortunately,  expe- 
diency has  overruled  sound  scientific 
judgment  in  some  cases,  resulting  in 
the  promulgation  of  unvalidated  and 
scientifically  defective  protocols  (6). 
Protocols  that  are  specified  in  a  con- 
tractual arrangement  may  be  selected 
arbitrarily  or  through  a  well-conceived 
selection  process.  Verification  of  their 
validity  for  the  specific  use  should  be 
a  prime  consideration. 

Figure 2. Collaborative test process

Validation for Specific Use. The
ultimate use of analytical
methodology is to produce compositional
information about specific samples
necessary for the solution of particular
problems  ranging  from  exotic  research 
investigations  to  the  very  mundane. 
The  selection  of  appropriate  measure- 
ment methodology  is  often  a  major 
consideration.  Methods  or  procedures, 
even  if  previously  validated  in  general 
terms,  cannot  unequivocally  be  as- 
sumed to  be  valid  for  the  situation  in 
hand,  because  of  possible  differences 
in  sample  matrix  and  other  consider- 
ations. Professional  analytical  chem- 
ists traditionally  have  recognized  this 
and  their  responsibility  to  confirm  or 
prove  (if  necessary)  both  the  validity 
of  the  methodology  used  for  specific 
application  (2)  and  their  own  ability 
to  reduce  it  to  practice. 

The  classical  validation  process  is  il- 
lustrated in  Figure  3.  When  reference 
samples  are  available  that  are  similar 
in  all  respects  to  the  test  samples,  the 
process  is  very  simple:  It  consists  of 
analyzing  a  sufficient  number  of  refer- 
ence samples  and  comparing  the  re- 
sults to  the  expected  or  certified  val- 
ues (7).  Before  or  during  such  an  exer- 
cise, the  analyst  must  demonstrate  the 
attainment  of  a  state  of  statistical  con- 
trol of  the  measurement  system  (8)  so 
that  the  results  can  be  relied  upon  as 
representative  of  those  expected  when 
using  the  methodology-measurement 
system. 
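
The comparison with a certified value can be sketched as a simple significance test. The example assumes that the measurement system is in statistical control, that the uncertainty of the certified value is negligible, and that a two-sided 95% Student's t criterion is appropriate; none of these choices is prescribed by the text, and the function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def bias_check(measurements, certified_value, t_crit):
    """Compare the mean of replicate results with a certified value.

    Returns (difference, confidence half-width, significant?).
    """
    n = len(measurements)
    diff = mean(measurements) - certified_value
    half_width = t_crit * stdev(measurements) / sqrt(n)
    return diff, half_width, abs(diff) > half_width

# Seven hypothetical results on a reference sample certified at 50.0.
results = [50.2, 49.8, 50.1, 50.0, 49.9, 50.3, 50.1]
T_95_6DF = 2.447  # two-sided 95% t value for 6 degrees of freedom

diff, half_width, significant = bias_check(results, 50.0, T_95_6DF)
print(f"difference = {diff:.3f}, half-width = {half_width:.3f}")
print("bias detected" if significant else "no bias detected at this level")
```

A difference smaller than the half-width does not prove the absence of bias; it shows only that any bias is too small to detect with the precision attained.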


When  a  suitable  reference  material 
is  not  available,  several  other  ap- 
proaches are  possible.  One  consists  of 
comparing  the  results  of  the  candidate 
method  with  those  of  another  method 
known  to  be  applicable  and  reliable, 
but  not  useful  in  the  present  situation 
because  of  cost,  unavailability  of  per- 
sonnel or  equipment,  or  other  reasons. 
Even  the  agreement  of  results  with 
those  obtained  using  any  additional 
independent  method  can  provide 
some  useful  information. 

Spiked  samples  and  surrogates  may 
be  used  as  reference  samples.  This  ap- 
proach is  less  desirable  and  less  satis- 
factory because  of  the  difficulty  in  the 
reliable  preparation  of  such  samples 
and  because  artificially  added  materi- 
als such  as  spikes  and  surrogates  may 
exhibit  matrix  effects  differing  from 
those  of  natural  samples.  Split  sam- 
ples of  the  actual  test  samples  may  be 
used  to  evaluate  the  precision  of  a 
method  or  procedure,  but  they  pro- 
vide no  information  about  the  pres- 
ence or  magnitude  of  any  measure- 
ment bias. 
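
The point about split samples can be made concrete. A standard deviation may be estimated from k duplicate pairs as the square root of the sum of squared differences divided by 2k. The sketch below uses that common formula with hypothetical data, and it also shows that a constant bias leaves the estimate unchanged.

```python
from math import sqrt

def sd_from_duplicates(pairs):
    """Estimate the standard deviation from differences of duplicate pairs."""
    k = len(pairs)
    return sqrt(sum((a - b) ** 2 for a, b in pairs) / (2 * k))

# Hypothetical results on split test samples.
pairs = [(5.1, 5.3), (4.9, 5.0), (5.2, 5.0), (5.0, 5.1)]
s = sd_from_duplicates(pairs)

# Adding the same bias to every result does not change the differences,
# so the precision estimate cannot reveal it.
biased_pairs = [(a + 1.0, b + 1.0) for a, b in pairs]
s_biased = sd_from_duplicates(biased_pairs)

print(f"s = {s:.4f}, s with bias = {s_biased:.4f}")
```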

Another  approach  is  to  infer  the  ap- 
propriateness of  methodology  from 
measurements  on  analogous  but  dis- 
similar reference  materials.  The  criti- 
cal professional  judgment  of  the  ana- 
lyst is  necessary  to  decide  the  validity 
of  the  inference. 

In  all  cases,  sufficient  tests  must  be 
made  to  evaluate  the  methodology  for 
the variety of matrices and ranges of
composition  expected  during  the  mea- 
surement process.  Ordinarily,  the  lat- 
ter should  include  three  levels  of  con- 
centration, namely,  the  extremes  and 
the  mid-range  of  compositions  expect- 
ed. Statistical  considerations  suggest 
that  at  least  six  degrees  of  freedom 
(ordinarily  seven  measurements) 
should  be  involved  at  each  decision 
point. 
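
The text does not give the statistical argument for six degrees of freedom. One common way to look at it, offered here only as an illustration, is to see how the 95% confidence half-width of a mean, expressed in units of the sample standard deviation, shrinks as the number of measurements grows. The t values are standard table entries; the function name is hypothetical.

```python
from math import sqrt

# Two-sided 95% Student's t critical values, indexed by degrees of freedom
# (standard table values, rounded to three decimals).
T_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
        6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}

def ci_factor(n):
    """Half-width of the 95% confidence interval of a mean of n results,
    in units of the sample standard deviation."""
    df = n - 1  # one degree of freedom is used to estimate the mean
    return T_95[df] / sqrt(n)

for n in range(2, 12):
    print(f"n={n:2d}  df={n - 1:2d}  half-width = {ci_factor(n):.2f} s")
```

With these table values the half-width first falls below one standard deviation at seven measurements, and further replication brings progressively smaller gains.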

Conclusion 

A  valid  method  is  necessary  but  not 
sufficient  for  the  production  of  valid 
data.  Most  methods  require  a  degree 
of  skill  on  the  part  of  the  analyst;  this 
skill  constitutes  a  critical  factor  in  the 
measurement  process.  It  is  common 
knowledge  that  data  obtained  by  sev- 
eral laboratories  on  the  same  test  sam- 
ple using  the  same  methodology  may 
show  a  high  degree  of  variability.  The 
alleviation  of  such  a  problem  is  in  the 
area of quality assurance of the
measurements (8). Data obtained by a
valid  method  used  in  a  well-designed 
quality  assurance  program  should 
allow  the  assignment  of  limits  of  un- 
certainty that  can  be  used  to  judge  the 
data's  validity. 

It  should  be  remembered  that  the 
validity  of  any  data  will  also  depend 
upon  the  validity  of  the  model  and  of 
the sample (8, 9). The model
represents the conceptualization of the
problem to be solved, describes the
samples  that  should  be  analyzed,  the 
data  base  required,  and  the  way  the 
model  will  be  utilized.  Obviously,  even 
flawless  measurement  data  will  be  of 
little  value  if  the  basic  concepts  are 
faulty.  Likewise  the  samples  analyzed 
must  be  valid  if  the  results  obtained 
for  them  are  to  be  intelligently  inter- 
preted. 

The  key  role  of  reliable  reference 
materials  in  the  validation  of  analyti- 
cal measurements  cannot  be  overem- 
phasized. Their  use  in  validating  the 
methodology  has  already  been  dis- 
cussed. A  planned  sequential  analysis 
of  reference  materials  in  a  quality  as- 
surance program  can  assess  the  quali- 
ty of  the  data  output  and  thus  validate 
the  overall  aspects  of  the  analytical 
measurement  system  (7). 